
THE MATHEMATICS OF 


PHYSICS AND CHEMISTRY 


BY 

HENRY MARGENAU 

Eugene Higgins Professor of Physics and Natural Philosophy 
Yale University 

and 

GEORGE MOSELEY MURPHY 

Chairman , Department of Chemistry 
Washington Square College 
New York University 

SECOND EDITION 



( An East-West Edition ) 


D. VAN NOSTRAND COMPANY, Inc. 
PRINCETON, NEW JERSEY 




120 Alexander St., Princeton, New Jersey (Principal office) 
24 West 40th Street, New York 18, New York 

D. Van Nostrand Company, Ltd. 

358, Kensington High Street, London, W.14, England 

D. Van Nostrand Company (Canada), Ltd. 

25 Hollinger Road, Toronto 16, Canada 

Copyright, ©, 1943, 1956 

BY 

D. VAN NOSTRAND COMPANY, Inc. 


Published simultaneously in Canada by 
D. Van Nostrand Company (Canada), Ltd. 

Library of Congress Catalog Card No. 55-10911 

All Rights Reserved 

This book, or any parts thereof, may not be
reproduced in any form without written permission
from the authors and the publishers.

First Published May 1943 
Fourteen Reprintings 
Second Edition, January 1956 
Reprinted June 1966, July 1957, March 1959, January 1961,
March 1962

East-West Student Edition - 1963
Affiliated East-West Press P. Ltd.

Second East-West Reprint - 1966

Price in India Rs. 15.00
Rest of Asia $3.80
Sales territory: Asia

Reprinted in India with the special permission of the original
Publishers, the D. Van Nostrand Company Inc., Princeton,
New Jersey, U.S.A. and the copyright holders.

This book has been published with the assistance of the Joint
Indian-American Standard Works Programme.

Published by W. D. Ten Broeck for AFFILIATED EAST-WEST
PRESS PRIVATE LTD., C 57 Defence Colony, New Delhi 3,
India, and printed by S. M. Balsaver at USHA PRINTERS,


In the second edition the main plan of the book has been left unchanged.
Small amounts of material have been added in a great number of places,
and improvements have been attempted at many points. It was felt that
a lack in the first edition was its omission of the theory of Laplace and
Fourier transforms. This has been remedied in Chapter 8. The discussion
of numerical calculations, integral equations and group theory has
likewise been augmented by removal of unnecessary items and some
replacements.

Ambiguities, errors and pedagogical faults have been sought in an
endeavor to eliminate them. If we have partly succeeded in this task,
we owe it to a host of readers and our students who have given us the
benefit of their advice and criticism. In this respect we are particularly
grateful to many scientists at the Navy Electronics Laboratory who prepared
a detailed list of errors soon after the first edition of the book appeared.
A similar and very useful list of errata was sent by Professor
Pentti Salomaa of the University of Turku, Finland, to whom we express
our indebtedness. Dr. M. H. Greenblatt of R.C.A. suggested an improvement
in Chapter 12 of which we have made use. Finally, we acknowledge
stimulus and aid coming from the careful work of Professors Tsugihiko
Sato and Makoto Kuminune of Japan who, in translating the first edition,
discovered a number of inaccuracies which have now been corrected.

H. M.

G. M. M.

New Haven, Conn.
November, 1955

PREFACE

The authors’ aim has been to present, between the covers of a single
book, those parts of mathematics which form the tools of the modern
worker in theoretical physics and chemistry. They have endeavored to
do this by steering a middle course between the mere recording of facts and
formulas which is typical of handbook treatments, and the ponderous
development which characterizes treatises in special fields. Therefore, 
as far as space permitted, all results have been embedded in the logical 
texture of proofs. Occasionally, when full demonstrations are lengthy or 
not particularly illuminating with respect to the subject at hand, they 
have been omitted in favor of references to the literature. Except for the 
first chapter, which is primarily a survey, proofs have always been given
where omission would destroy the continuity of treatment.

Arbitrary selection of topics has been necessary for lack of space. This 
was based partly on the authors’ opinions as to the relevance of various 
subjects, partly on the results of consultations with colleagues. The 
degree of difficulty of the treatment is such that a Senior majoring in physics 
or chemistry would be able to read most parts of the book with under- 
standing. 

While inclusion of large collections of routine problems did not seem 
conformable to the purpose of the book, the authors have felt that its 
usefulness might be augmented by two minor pedagogical devices: the 
insertion here and there of fully worked examples illustrative of the theory 
under discussion, and the dispersal, throughout the book, of special prob- 
lems confirming, and in some cases supplementing, the ideas of the text. 
Answers to the problems are usually given. 

The degree of rigor to which we have aspired is that customary in 
careful scientific demonstrations, not the lofty heights accessible to the 
pure mathematician. For this we make no apology; if the history of the 
exact sciences teaches anything it is that emphasis on extreme rigor often 
engenders sterility, and that the successful pioneer depends more on 
brilliant hunches than on the results of existence theorems. We trust, of 
course, that our effort to avoid rigor mortis has not brought us danger- 
ously close to the opposite extreme of sloppy reasoning. 

A careful attempt has been made to insure continuity of presentation 
within each chapter, and as far as possible throughout the book. The 
diversity of the subjects has made it necessary to refer occasionally to 
chapters ahead. Whenever this occurs it is done reluctantly and in order 
to avoid repetition. 

As to form, considerations of literacy have often been given secondary 
rank in favor of conciseness and brevity, and no great attempt has been 
made to disguise individual authorship by artificially uniformising the 
style. 

The authors have used the material of several of the chapters in a num- 
ber of special courses and have found its collection into a single volume 
convenient. To venture a few specific suggestions, the book, if it were
judged favorably by mathematicians, would serve as a foundation for
courses in applied mathematics on the senior and first year graduate level.
A thorough introductory course in quantum mechanics could be based on 
chapter 2, parts of 3, 8 and 10, and chapter 11. Chapters 1, 10 and parts 
of 11 may be used in a short course which reviews thermodynamics and 
then treats statistical mechanics. Reading of chapters 4, 9, and 15 would 
prepare for an understanding of special treatments dealing with polyatomic 
molecules, and the liquid and solid state. Since ability to handle numeri- 
cal computations is very important in all branches of physics and chemistry, 
a chapter designed to familiarize the reader with all tools likely to be needed 
in such work has been included. 

The index has been made sufficiently complete so that the book can 
serve as a ready reference to definitions, theorems and proofs. Graduate 
students and scientists whose memory of specific mathematical details is 
dimmed may find it useful in review. Last, but not least, the authors 
have had in mind the adventurous student of physics and chemistry who 
wishes to improve his mathematical knowledge through self-study. 

Henry Margenau 

George M. Murphy 


New Haven , Conn. 
March, 1943 




CONTENTS 


CHAPTER PAGE 

1 THE MATHEMATICS OF THERMODYNAMICS 

1.1 Introduction 1 

1.2 Differentiation of Functions of Several Independent 

Variables 2 

1.3 Total Differentials 3 

1.4 Higher Order Differentials 5 

1.5 Implicit Functions 6 

1.6 Implicit Functions in Thermodynamics 7 

1.7 Exact Differentials and Line Integrals 8 

1.8 Exact and Inexact Differentials in Thermodynamics 8 

1.9 The Laws of Thermodynamics 11 

1.10 Systematic Derivation of Partial Thermodynamic De- 

rivatives 15 

1.11 Thermodynamic Derivatives by Method of Jacobians 17 

1.12 Properties of the Jacobian 18 

1.13 Application to Thermodynamics 20 

1.14 Thermodynamic Systems of Variable Mass 24 

1.15 The Principle of Carathéodory 26 

2 ORDINARY DIFFERENTIAL EQUATIONS 

2.1 Preliminaries 32 

2.2 The Variables are Separable 33 

2.3 The Differential Equation Is, or Can be Made, Exact. 

Linear Equations 41 

2.4 Equations Reducible to Linear Form 44 

2.5 Homogeneous Differential Equations 45 

2.6 Note on Singular Solutions. Clairaut’s Equation. ... 47 

2.7 Linear Equations with Constant Coefficients; Right- 

Hand Member Zero 48 

2.8 Linear Equations with Constant Coefficients; Right- 

Hand Member a Function of x 53 

2.9 Other Special Forms of Second Order Differential 



2.12 General Considerations Regarding Series Integration. 

Fuchs' Theorem 69 

2.13 Gauss' (Hypergeometric) Differential Equation 72 

2.14 Bessel's Equation 74 

2.15 Hermite's Differential Equation 76 

2.16 Laguerre's Differential Equation 77 

2.17 Mathieu's Equation 78 

2.18 Pfaff Differential Expressions and Equations 82 

3 SPECIAL FUNCTIONS 

3.1 Elements of Complex Integration 89 

3.1a Theorem of Laurent. Residues 91 

3.2 Gamma Function 93 

3.3 Legendre Polynomials 98 

3.4 Integral Properties of Legendre Polynomials 104 

3.5 Recurrence Relations between Legendre Polynomials . 105 

3.6 Associated Legendre Polynomials 106 

3.7 Addition Theorem for Legendre Polynomials 109 

3.8 Bessel Functions 113 

3.9 Hankel Functions and Summary on Bessel Functions 118 

3.10 Hermite Polynomials and Functions 121 

3.11 Laguerre Polynomials and Functions 126 

3.12 Generating Functions 132 

3.13 Linear Dependence 132 

3.14 Schwarz’ Inequality 134 

4 VECTOR ANALYSIS 

4.1 Definition of a Vector. . 137 

4.2 Unit Vectors 140 

4.3 Addition and Subtraction of Vectors. 140 

4.4 The Scalar Product of Two Vectors. . * 141 

4.5 The Vector Product of Two Vectors 142 

4.6 Products Involving Three Vectors 146 

4.7 Differentiation of Vectors 148 

4.8 Scalar and Vector Fields 149 

4.9 The Gradient 150 

4.10 The Divergence 151 

4.11 The Curl. 152 

4.12 Composite Functions Involving V 153 

4.13 Successive Applications of V 153 



4.18 Theorem of the Divergence 159 

4.19 Green’s Theorems 161 

4.20 Tensors 161 

4.21 Addition, Multiplication and Contraction 164 

4.22 Differentiation of Tensors 167 

4.23 Tensors and the Elastic Body 169 

5 COORDINATE SYSTEMS. VECTORS AND CURVI- 
LINEAR COORDINATES 

5.1 Curvilinear Coordinates 172 

5.2 Vector Relations in Curvilinear Coordinates 174 

5.3 Cartesian Coordinates 177 

5.4 Spherical Polar Coordinates 177 

5.5 Cylindrical Coordinates 178 

5.6 Confocal Ellipsoidal Coordinates 178 

5.7 Prolate Spheroidal Coordinates 180 

5.8 Oblate Spheroidal Coordinates 182 

5.9 Elliptic Cylindrical Coordinates 182 

5.10 Conical Coordinates 183 

5.11 Confocal Paraboloidal Coordinates 184 

5.12 Parabolic Coordinates 185 

5.13 Parabolic Cylindrical Coordinates 186 

5.14 Bipolar Coordinates 187 

5.15 Toroidal Coordinates 190 

5.16 Tensor Relations in Curvilinear Coordinates 192 

5.17 The Differential Operators in Tensor Notation 195 

6 CALCULUS OF VARIATIONS 

6.1 Single Independent and Single Dependent Variable. . 198 

6.2 Several Dependent Variables 203 

6.3 Example: Hamilton’s Principle 204 

6.4 Several Independent Variables 207 

6.5 Accessory Conditions; Lagrangian Multipliers 209 

6.6 Schrödinger Equation 213 

6.7 Concluding Remarks 214 

7 PARTIAL DIFFERENTIAL EQUATIONS OF CLASSICAL 
PHYSICS 

7.1 General Considerations 216 

7.2 Laplace’s Equation . ... 217 

7.3 Laplace’s Equation in Two Dimensions 218 

*7 A T » 1 « XT' HPU — . “ ^ ^ rtrtA 



7.6 Simple Electrostatic Potentials 224 

7.7 Conducting Sphere in the Field of a Point Charge — 226 

7.8 The Wave Equation 228 

7.9 One Dimension 231 

7.10 Two Dimensions 231 

7.11 Three Dimensions 232 

7.12 Examples of Solutions of the Wave Equation 235 

7.13 Equation of Heat Conduction and Diffusion 237 

7.14 Example: Linear Flow of Heat 238 

7.15 Two-Dimensional Flow of Heat 240 

7.16 Heat Flow in Three Dimensions 240 

7.17 Poisson's Equation 241 

8 EIGENVALUES AND EIGENFUNCTIONS 

8.1 Simple Examples of Eigenvalue Problems 246 

8.2 Vibrating String; Fourier Analysis 247 

8.3 Vibrating Circular Membrane; Fourier-Bessel Trans- 

forms 254 

8.4 Vibrating Sphere with Fixed Surface 258 

8.5 Laplace and Related Transformations 259 

8.6 Use of Transforms in Solving Differential Equations 263 

8.7 Sturm-Liouville Theory . .. 267 

8.8 Variational Aspects of the Eigenvalue Problem 270 

8.9 Distribution of High Eigenvalues 274 

8.10 Completeness of Eigenfunctions 277 

8.11 Further Comments and Generalizations 279 

9 MECHANICS OF MOLECULES 

9.1 Introduction 282 

9.2 General Principles of Classical Mechanics 282 

9.3 The Rigid Body in Classical Mechanics 284 

9.4 Velocity, Angular Momentum, and Kinetic Energy 285 

9.5 The Eulerian Angles 286 

9.6 Absolute and Relative Velocity 289 

9.7 Motion of a Molecule 290 

9.8 The Kinetic Energy of a Molecule 292 

9.9 The Hamiltonian Form of the Kinetic Energy 293 

9.10 The Vibrational Energy of a Molecule 294 

9.11 Vibrations of a Linear Triatomic Molecule 297 

9.12 Quantum Mechanical Hamiltonian 299 

10 MATRICES AND MATRIX ALGEBRA 

10.4 Multiplication and Differentiation of Determinants. . . 304 

10.5 Preliminary Remarks on Matrices 305 

10.6 Combination of Matrices 306 

10.7 Special Matrices 307 

10.8 Real Linear Vector Space 311 

10.9 Linear Equations 313 

10.10 Linear Transformations 314 

10.11 Equivalent Matrices 316 

10.12 Bilinear and Quadratic Forms 317 

10.13 Similarity Transformations 318 

10.14 The Characteristic Equation of a Matrix 318 

10.15 Reduction of a Matrix to Diagonal Form 319 

10.16 Congruent Transformations 322 

10.17 Orthogonal Transformations 324 

10.18 Hermitian Vector Space 328 

10.19 Hermitian Matrices 329 

10.20 Unitary Matrices 330 

10.21 Summary on Diagonalization of Matrices 331 

11 QUANTUM MECHANICS 

11.1 Introduction 333 

11.2 Definitions 335 

11.3 Postulates 337 

11.4 Orthogonality and Completeness of Eigenfunctions. . . 344 

11.5 Relative Frequencies of Measured Values 346 

11.6 Intuitive Meaning of a State Function 347 

11.7 Commuting Operators 348 

11.8 Uncertainty Relation 348 

11.9 Free Mass Point 350 

11.10 One-Dimensional Barrier Problems 353 

11.11 Simple Harmonic Oscillator 358 

11.12 Rigid Rotator, Eigenvalues and Eigenfunctions of L² 360 

11.13 Motion in a Central Field 363 

11.14 Symmetrical Top 368 

11.15 General Remarks on Matrix Mechanics 371 

11.16 Simple Harmonic Oscillator by Matrix Methods 372 

11.17 Equivalence of Operator and Matrix Methods 374 

11.18 Variational (Ritz) Method . 377 

11.19 Example: Normal State of the Helium Atom 380 

11.20 The Method of Linear Variation Functions 383 

11.21 Example: The Hydrogen Molecular Ion Problem 385 

11.22 Perturbation Theory 387 

11.23 Example: Non-Degenerate Case. The Stark Effect 391 

11.24 Example: Degenerate Case. The Normal Zeeman 

Effect 392 



11.25 General Considerations Regarding Time-Dependent 

States 393 

11.26 The Free Particle; Wave Packets 396 

11.27 Equation of Continuity, Current 399 

11.28 Application of Schrodinger's Time Equation. Simple 

Radiation Theory 41 

11.29 Fundamentals of the Pauli Spin Theory 4< 

11.30 Applications 41 

11.31 Separation of the Coordinates of the Center of Mass 

in the Many-Body Problem 411 

11.32 Independent Systems 414 

11.33 The Exclusion Principle 415 

11.34 Excited States of the Helium Atom 418 

11.35 The Hydrogen Molecule 424 

12 STATISTICAL MECHANICS 

12.1 Permutations and Combinations 431 

12.2 Binomial Coefficients 433 

12.3 Elements of Probability Theory 435 

12.4 Special Distributions 438 

12.5 Gibbsian Ensembles 442 

12.6 Ensembles and Thermodynamics 444 

12.7 Further Considerations Regarding the Canonical En- 

semble 448 

12.8 The Method of Darwin and Fowler 4i 

12.9 Quantum Mechanical Distribution Laws 4 1 

12.10 The Method of Steepest Descents 4£ 

13 NUMERICAL CALCULATIONS 

13.1 Introduction 467 

13.2 Interpolation for Equal Values of the Argument 467 

13.3 Interpolation for Unequal Values of the Argument 470 

13.4 Inverse Interpolation 471 

13.5 Two-way Interpolation 471 

13.6 Differentiation Using Interpolation Formula 472 

13.7 Differentiation Using a Polynomial 473 

13.8 Introduction to Numerical Integration 473 

13.9 The Euler-Maclaurin Formula 474 

13.10 Gregory's Formula 47C 

1R.11 THa Pnrmnla 4,7f 



13.15 The Taylor Series Method 483 

13.16 The Method of Picard (Successive Approximations or 

Iteration) 484 

13.17 The Modified Euler Method 485 

13.18 The Runge-Kutta Method 486 

13.19 Continuing the Solution 487 

13.20 Milne’s Method 489 

13.21 Simultaneous Differential Equations of the First Order 489 

13.22 Differential Equations of Second or Higher Order. . . . 490 

13.23 Numerical Solution of Transcendental Equations. . . . 491 

13.24 Simultaneous Equations in Several Unknowns 493 

13.25 Numerical Determination of the Roots of Polynomials 494 

13.26 Numerical Solution of Simultaneous Linear Equations 497 

13.27 Evaluation of Determinants 499 

13.28 Solution of Secular Determinants. . 500 

13.29 Errors 504 

13.30 Principle of Least Squares 506 

13.31 Errors and Residuals 507 

13.32 Measures of Precision 510 

13.33 Precision Measures and Residuals 513 

13.34 Experiments of Unequal Weight 514 

13.35 Probable Error of a Function 515 

13.36 Rejection of Observations 516 

13.37 Empirical Formulas 516 

14 LINEAR INTEGRAL EQUATIONS 

14.1 Definitions and Terminology 520 

14.2 The Liouville-Neumann Series 521 

14.3 Fredholm’s Method of Solution 526 

14.4 The Schmidt-Hilbert Method of Solution 528 

14.5 Summary of Methods of Solution 532 

14.6 Relation between Differential and Integral Equations . 532 

14.7 Green’s Function 534 

14.8 The Inhomogeneous Sturm-Liouville Equation 538 

14.9 Some Examples of Green’s Function 539 

14.10 Abel’s Integral Equation 541 

14.11 Vibration Problems 542 

15 GROUP THEORY 

15.1 Definitions 545 

i n e • a c* 



15.5 Conjugate Subgroups 54$ 

15.6 Isomorphism 54( 

15.7 Representation of Groups 55( 

15.8 Reduction of a Representation 551 

15.9 The Character 55^ 

15.10 The Direct Product. 55< 

15.11 The Cyclic Group 55' 

15.12 The Symmetric Group 55i 

15.13 The Alternating Group 56: 

15.14 The Unitary Group 56! 

15.15 The Three-Dimensional Rotation Groups 56i 

15.16 The Two-Dimensional Rotation Groups 571 

15.17 The Dihedral Groups 57: 

15.18 The Crystallographic Point Groups 57' 

15.19 Applications of Group Theory 58 

INDEX 58 


CHAPTER 1 


THE MATHEMATICS OF THERMODYNAMICS 

Most of the chapters of this book endeavor to treat some single mathematical
method in a systematic manner. The subject of thermodynamics,
being highly empirical and synoptic in its contents, does not contain a very
uniform method of analysis. Nevertheless, it involves mathematical
elements of considerable interest, chiefly centered about partial differentiation.
Rather than omit these entirely from consideration, it seemed well
to devote the present chapter to them. Of necessity, the treatment is
perhaps less systematic than elsewhere. It is placed at the beginning
because most readers are likely to have some familiarity with the subject
and because the mathematical methods are simple. (A reading of the first
chapter is not essential for an understanding of the remainder of the book.)
1.1. Introduction. The science of thermodynamics is concerned with
the laws that govern the transformations of energy of one kind into another
during physical or chemical changes. These changes are assumed to occur
within a thermodynamic system which is completely isolated from its
surroundings. Such a system is described by means of thermodynamic variables
which are of two kinds. Extensive variables are proportional to the amount
of matter which is being considered; typical examples are the volume or the
energy of the system. Variables which are independent of the amount
of matter present, such as pressure or temperature, are called intensive
variables.

It is found experimentally that it is not possible to change all of these
variables independently, for if certain ones of them are held constant, the
remaining ones are automatically fixed in value. Mathematically, such a
situation is treated by the method of partial differentiation. Furthermore,
a certain type of differential, called the exact differential, and an integral,
known as the line integral, are of great importance in the study of
thermodynamics. We propose to describe these matters in a general way and to
apply them to a few specific problems. We assume that the reader is
familiar with the general ideas of thermodynamics and refer him to other
sources¹ for a more complete treatment of the physical details.

¹ A representative set of references on thermodynamics will be found at the end of
this chapter. Although not easy to read, serious students of the subject should be
conversant with the work of J. Willard Gibbs, Transactions of the Conn. Acad., 1875-1878;
“Collected Works,” Vol. I, Longmans, Green and Co., New York, 1928; “A Commentary
on the Scientific Writings of J. W. Gibbs,” 2 vols., Yale University Press,
New Haven, 1937.




1.2. Differentiation of Functions of Several Independent Variables. If
z is a single-valued function of two real, independent variables, x and y,

$$z = f(x,y)$$

z is said to be an explicit function of x and y. The relation between the
three variables may be represented by plotting x, y and z along the axes of a
Cartesian coordinate system, the result being a surface. If we wish to
study the motion of some point (x,y) over the surface, there are three
possible cases: (a) x varies and y remains constant; (b) y varies, x remaining
constant; (c) both x and y vary simultaneously.

In the first and second cases, the path of the point will be along the
curves produced when planes, parallel to the XZ- or YZ-coordinate planes,
intersect the original surface. If x is increased by the small quantity Δx
and y remains constant, z changes from f(x,y) to f(x + Δx, y), and the
partial derivative of z with respect to x at the point (x,y) is defined by

$$f_x(x,y) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x,y)}{\Delta x}$$

The following alternative notations are often used

$$f_x(x,y) = z_x(x,y) = \left(\frac{\partial f}{\partial x}\right)_y = \left(\frac{\partial z}{\partial x}\right)_y \tag{1-1}$$

where the constancy of y is indicated by the subscript. Since both x and y
are completely independent, the partial derivative is evaluated by the
usual method for the differentiation of a function of a single variable, y
being treated as a constant.

Defining the partial derivative of z with respect to y (x remaining constant)
in a similar way, we may write

$$f_y(x,y) = z_y(x,y) = \left(\frac{\partial f}{\partial y}\right)_x = \left(\frac{\partial z}{\partial y}\right)_x \tag{1-2}$$
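The limit definitions above can be tried numerically. The following is a minimal Python sketch, not part of the original text: the test function, the step size `h`, and the helper names `partial_x` and `partial_y` are assumptions chosen for illustration. A central difference is used in place of the one-sided quotient of the definition because it converges faster.

```python
import math

def partial_x(f, x, y, h=1e-6):
    """Estimate (df/dx) at constant y with a central difference."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Estimate (df/dy) at constant x with a central difference."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def f(x, y):
    # Hypothetical test function; analytically f_x = 2xy, f_y = x^2 + cos(y).
    return x**2 * y + math.sin(y)

x0, y0 = 1.5, 0.7
fx = partial_x(f, x0, y0)
fy = partial_y(f, x0, y0)
print(fx, 2 * x0 * y0)
print(fy, x0**2 + math.cos(y0))
```

Each printed pair compares the numerical estimate with the analytic derivative obtained by treating the other variable as a constant.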

If z is a function of more than two variables

$$z = f(x_1, x_2, \cdots, x_n)$$

the simple geometric interpretation is lacking, but such a symbol as: 


itives 



(1-3) 


It is not always true that $f_{xy} = f_{yx}$, but the order of differentiation is
immaterial if the function and its derivatives are continuous. Since this is
usually the case in physical applications, quantities such as $f_{xy}$, $f_{yx}$ or
$f_{xyx}$, $f_{yxx}$ will be considered identical in the present treatment.

1.3. Total Differentials. In the third case of sec. 1.2, both x and y vary
simultaneously or, in geometric language, the point moves along a curve
determined by the intersection with z = f(x,y) of a surface which is neither
parallel with the XZ- nor YZ-coordinate plane. Since x and y are independent,
both Δx and Δy approach zero as Δz approaches zero. In that
case, the change in z caused by increments Δx and Δy, called the total
differential of z, is given by

$$dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy \tag{1-4}$$


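If it happens that x and y depend on a single independent variable u (it
might be the arc length of the curve along which the point moves, or the
time), so that

$$z = f(x,y); \quad x = F_1(u); \quad y = F_2(u)$$

then from (4)

$$\frac{dz}{du} = \left(\frac{\partial z}{\partial x}\right)_y \frac{dx}{du} + \left(\frac{\partial z}{\partial y}\right)_x \frac{dy}{du} \tag{1-5}$$

In the special case,

$$z = f(x,y); \quad x = F(y); \quad y \text{ independent}$$

$$\frac{dz}{dy} = \left(\frac{\partial z}{\partial x}\right)_y \frac{dx}{dy} + \left(\frac{\partial z}{\partial y}\right)_x \tag{1-6}$$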
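Relation (1-5) can be checked numerically. The sketch below is an illustration only: the particular functions chosen for f, F1 and F2, the evaluation point, and the helper `derivative` are assumptions, not taken from the text. It compares the chain-rule expression with a direct differentiation of z regarded as a function of u alone.

```python
import math

def f(x, y):
    # Hypothetical z = f(x, y)
    return x * y**2

def F1(u):
    # Hypothetical x = F1(u)
    return math.cos(u)

def F2(u):
    # Hypothetical y = F2(u)
    return u**2

def derivative(g, t, h=1e-6):
    """Central-difference estimate of dg/dt for a one-variable function."""
    return (g(t + h) - g(t - h)) / (2 * h)

u0 = 0.8
x0, y0 = F1(u0), F2(u0)

# Partial derivatives of f, each taken with the other variable held fixed.
f_x = derivative(lambda s: f(s, y0), x0)
f_y = derivative(lambda s: f(x0, s), y0)

# Right-hand side of (1-5).
chain = f_x * derivative(F1, u0) + f_y * derivative(F2, u0)

# Left-hand side: differentiate z(u) = f(F1(u), F2(u)) directly.
direct = derivative(lambda s: f(F1(s), F2(s)), u0)

print(chain, direct)
```

The two printed numbers should agree to within the accuracy of the finite differences.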


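An important generalization of these results arises when x, y, ... are not
independent variables but are each functions of a finite number of independent
variables, u, v, ...

$$f = f(x,y,z,\cdots); \quad x = F_1(u,v,w,\cdots); \quad y = F_2(u,v,w,\cdots)$$

Then, from (4)

$$df = \left(\frac{\partial f}{\partial u}\right)_{v,w,\cdots} du + \left(\frac{\partial f}{\partial v}\right)_{u,w,\cdots} dv + \cdots \tag{1-7}$$

and from (5)

$$\left(\frac{\partial f}{\partial u}\right)_{v,w,\cdots} = \left(\frac{\partial f}{\partial x}\right)_{y,z,\cdots} \left(\frac{\partial x}{\partial u}\right) + \left(\frac{\partial f}{\partial y}\right)_{x,z,\cdots} \left(\frac{\partial y}{\partial u}\right) + \cdots \tag{1-8}$$

with similar expressions for (∂f/∂v), (∂f/∂w), ... When these are put in
(7) we obtain

$$df = \left[\frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} + \cdots\right] du + \left[\frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} + \cdots\right] dv + \cdots$$

$$= \frac{\partial f}{\partial x}\left[\frac{\partial x}{\partial u} du + \frac{\partial x}{\partial v} dv + \cdots\right] + \frac{\partial f}{\partial y}\left[\frac{\partial y}{\partial u} du + \frac{\partial y}{\partial v} dv + \cdots\right] + \cdots \tag{1-9}$$

Since u, v, ... are independent variables, we may write

$$dx = \frac{\partial x}{\partial u} du + \frac{\partial x}{\partial v} dv + \cdots; \quad dy = \frac{\partial y}{\partial u} du + \frac{\partial y}{\partial v} dv + \cdots \tag{1-10}$$

Comparing coefficients in (9) and (10), we finally obtain

$$df = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy + \cdots \tag{1-11}$$

The difference between (7) and (11) should be noted: in the former equation
the partial derivatives are taken with respect to the independent
variables, while in the latter, with respect to the dependent variables. The
important conclusion may thus be drawn that the total differential may ...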


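1.4. Higher Order Differentials. Differentials of the second, third and
higher orders are defined by

$$d^2f = d(df); \quad d^3f = d(d^2f); \quad \cdots; \quad d^nf = d(d^{n-1}f)$$

If there are two variables x and y, we obtain from (4)

$$d^2f = d(df) = d\left(\frac{\partial f}{\partial x}\right) dx + \left(\frac{\partial f}{\partial x}\right) d(dx) + d\left(\frac{\partial f}{\partial y}\right) dy + \left(\frac{\partial f}{\partial y}\right) d(dy)$$

However,

$$d\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} dx + \frac{\partial^2 f}{\partial x \partial y} dy$$

with a similar expression for d(∂f/∂y), hence

$$d^2f = \frac{\partial^2 f}{\partial x^2}(dx)^2 + 2\frac{\partial^2 f}{\partial x \partial y} dx\,dy + \frac{\partial^2 f}{\partial y^2}(dy)^2 + \frac{\partial f}{\partial x} d^2x + \frac{\partial f}{\partial y} d^2y$$

If x and y are independent variables, $d^2x = d^3x = \cdots = d^nx = d^2y = \cdots = d^ny = 0$,
and the n-th order differential becomes

$$d^nf = \frac{\partial^n f}{\partial x^n} dx^n + \binom{n}{1} \frac{\partial^n f}{\partial x^{n-1} \partial y} dx^{n-1} dy + \cdots + \binom{n}{n-1} \frac{\partial^n f}{\partial x\, \partial y^{n-1}} dx\,dy^{n-1} + \frac{\partial^n f}{\partial y^n} dy^n \tag{1-12}$$

where the $\binom{n}{k}$ are the binomial coefficients, $\binom{n}{k} = n!/k!(n-k)!$
(Cf. sec. 12.2.)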

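Example. Calculate dp and d²p for a gas obeying van der Waals’
equation:

$$p = \frac{RT}{V - \beta} - \frac{a}{V^2}$$

$$\left(\frac{\partial p}{\partial T}\right)_V = \frac{R}{V - \beta}; \quad \left(\frac{\partial p}{\partial V}\right)_T = -\frac{RT}{(V - \beta)^2} + \frac{2a}{V^3}$$

$$\left(\frac{\partial^2 p}{\partial T^2}\right)_V = 0; \quad \left(\frac{\partial^2 p}{\partial V^2}\right)_T = \frac{2RT}{(V - \beta)^3} - \frac{6a}{V^4}$$

$$\frac{\partial}{\partial V}\left(\frac{\partial p}{\partial T}\right) = \frac{\partial}{\partial T}\left(\frac{\partial p}{\partial V}\right) = -\frac{R}{(V - \beta)^2}$$

$$dp = \frac{R}{V - \beta} dT + \left[\frac{2a}{V^3} - \frac{RT}{(V - \beta)^2}\right] dV$$

$$d^2p = \left[\frac{2RT}{(V - \beta)^3} - \frac{6a}{V^4}\right](dV)^2 - \frac{2R}{(V - \beta)^2} dT\,dV$$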

1 v ^ 

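1.5. Implicit Functions. In the preceding sections, the dependence
of one variable on another has been given in explicit form, as y = f(x).
Let us assume the relation between the variables to be given in implicit
form such as f(x,y) = 0. If it is not possible to solve f(x,y) = 0 for y
and then differentiate, the derivative may still be obtained from (4),
since f remains constant:

$$df = \left(\frac{\partial f}{\partial x}\right)_y dx + \left(\frac{\partial f}{\partial y}\right)_x dy = 0; \quad \frac{dy}{dx} = -\frac{(\partial f/\partial x)_y}{(\partial f/\partial y)_x} \tag{1-13}$$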

ff the equations for a circle , 2 

If there are three variables, F(x,y,z) = 0, any one may be considered to
depend on the other two, for there are three possible relations

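$$x = f(y,z); \quad y = g(x,z); \quad z = h(x,y)$$

If x be taken as the dependent variable, then

$$dF = F_x dx + F_y dy + F_z dz = 0$$

At constant y, dy = 0, so that

$$\left(\frac{\partial x}{\partial z}\right)_y = -\frac{F_z}{F_x} \tag{1-14}$$

at constant z, dz = 0, hence

$$\left(\frac{\partial x}{\partial y}\right)_z = -\frac{F_y}{F_x} \tag{1-15}$$

A third possibility arises if two relations are given between three variables

$$f(x,y,z) = 0; \quad g(x,y,z) = 0$$

Then

$$df = f_x dx + f_y dy + f_z dz = 0$$

$$dg = g_x dx + g_y dy + g_z dz = 0$$

Solving these two equations, we obtain (see sec. 10.9)

$$dx : dy : dz = \begin{vmatrix} f_y & f_z \\ g_y & g_z \end{vmatrix} : \begin{vmatrix} f_z & f_x \\ g_z & g_x \end{vmatrix} : \begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix}$$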


Further examples of the properties of implicit functions and their derivatives
will be found in the discussion of thermodynamic quantities.

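1.6. Implicit Functions in Thermodynamics. The simplest thermodynamic
systems are homogeneous fluids or solids, subjected to no external
forces except a constant hydrostatic pressure. Investigation shows that
for all such systems, there is an equation of state or characteristic equation of
the form

$$f(p,V,T) = 0 \tag{1-16}$$

where p is the pressure exerted by the system, V is its volume and T, its
temperature on some suitable scale. From (16), a differential relation of
the same kind as in sec. 1.5 may then be obtained.

$$df = \left(\frac{\partial f}{\partial p}\right)_{V,T} dp + \left(\frac{\partial f}{\partial V}\right)_{p,T} dV + \left(\frac{\partial f}{\partial T}\right)_{p,V} dT = 0$$

Taking dp, dV, dT equal to zero, successively, there results a set of equations
similar to (14) and (15)

$$\left(\frac{\partial V}{\partial T}\right)_p = -\frac{(\partial f/\partial T)_{p,V}}{(\partial f/\partial V)_{p,T}} = \frac{1}{(\partial T/\partial V)_p}$$

$$\left(\frac{\partial T}{\partial p}\right)_V = -\frac{(\partial f/\partial p)_{T,V}}{(\partial f/\partial T)_{p,V}} = \frac{1}{(\partial p/\partial T)_V} \tag{1-17}$$

$$\left(\frac{\partial p}{\partial V}\right)_T = -\frac{(\partial f/\partial V)_{p,T}}{(\partial f/\partial p)_{T,V}} = \frac{1}{(\partial V/\partial p)_T}$$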


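Three possible products may be found by multiplying any pair of these
equations and removing the common terms. A typical one is

$$\left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p = -\left(\frac{\partial p}{\partial T}\right)_V \tag{1-18}$$

The product of all three derivatives is

$$\left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p \left(\frac{\partial T}{\partial p}\right)_V = -1 \tag{1-19}$$

These results are of considerable importance since they are verified by
experiment, the derivatives being proportional to such physical quantities
as the coefficients of compressibility, thermal expansion and temperature
increase with pressure.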
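The statement that the product of the three derivatives equals −1 can be illustrated numerically. The sketch below assumes, purely for illustration, the ideal-gas equation of state pV = RT for one mole; the state point and the helper `d_first` are likewise assumptions. Each derivative is estimated by a central difference on the corresponding explicit form.

```python
R = 8.314  # gas constant, J/(mol K)

# The three explicit forms of the assumed equation of state p V = R T.
def p_of(V, T):
    return R * T / V

def V_of(T, p):
    return R * T / p

def T_of(p, V):
    return p * V / R

def d_first(g, x, y, rel=1e-6):
    """Central-difference derivative of g with respect to its first
    argument, the second argument being held constant."""
    h = rel * abs(x)
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

T0, V0 = 300.0, 0.025
p0 = p_of(V0, T0)

dp_dV = d_first(p_of, V0, T0)   # (dp/dV) at constant T
dV_dT = d_first(V_of, T0, p0)   # (dV/dT) at constant p
dT_dp = d_first(T_of, p0, V0)   # (dT/dp) at constant V

product = dp_dV * dV_dT * dT_dp
print(product)
```

The printed product should be close to −1, whatever state point is chosen.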

1.7. Exact Differentials and Line Integrals. It is often required, in
thermodynamic problems, to find values of a function u(x,y) at two points
(x₁,y₁) and (x₂,y₂) by integration of an equation

$$du(x,y) = M(x,y)\,dx + N(x,y)\,dy \tag{1-20}$$

between the limits u₁ and u₂.

The attempted integration results in such a symbol as

$$\int_{x_1}^{x_2} M(x,y)\,dx$$

which is meaningless unless y can be eliminated by a relation, y = f(x).
This is equivalent to specifying the path in the XY-plane along which the
integration is performed, hence integrals of (20) are known as line integrals.
There are many of these paths, the value of the definite integral differing,
in general, for each. The situation is particularly simple when du is a total
differential, or, as it is often called, a complete or exact differential.
Comparison of (4) with (20) shows that in this case

$$M(x,y) = \partial u/\partial x; \quad N(x,y) = \partial u/\partial y \tag{1-21}$$

Moreover, since the order of differentiation is of no importance, it follows
that

$$\partial M/\partial y = \partial^2 u/\partial x\,\partial y = \partial N/\partial x \tag{1-22}$$

Inspection of (21) shows that u may be found by integration even when a
functional relation between x and y is unknown. In other words, the line
integral is independent of the path; it depends only on the values of x and
y at the upper and lower limits. The function u is then said to be a point
function.
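The exactness criterion, that ∂M/∂y must equal ∂N/∂x, lends itself to a quick numerical test. In the sketch below the two differential forms, the sample point, and the helper name `cross_derivative_gap` are assumptions introduced for illustration; neither form is taken from the text.

```python
def cross_derivative_gap(M, N, x, y, h=1e-5):
    """Return dM/dy - dN/dx at (x, y), estimated by central differences.
    A value near zero is consistent with M dx + N dy being exact."""
    dM_dy = (M(x, y + h) - M(x, y - h)) / (2 * h)
    dN_dx = (N(x + h, y) - N(x - h, y)) / (2 * h)
    return dM_dy - dN_dx

# Hypothetical exact form: du = 2xy dx + x^2 dy, with u = x^2 y.
gap_exact = cross_derivative_gap(lambda x, y: 2 * x * y,
                                 lambda x, y: x**2, 1.3, 0.4)

# Hypothetical inexact form: y dx - x dy, where dM/dy = 1 and dN/dx = -1.
gap_inexact = cross_derivative_gap(lambda x, y: y,
                                   lambda x, y: -x, 1.3, 0.4)

print(gap_exact, gap_inexact)
```

A single sample point can only show that a form is inexact; to support exactness the gap should vanish throughout the region of interest.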

In thermodynamics, it frequently happens that the upper and lower
limits are the same, that is, the integration is performed around a complete
cycle. If the differential du is exact, then the value of the line integral is
zero; if it is inexact, integration around a closed cycle gives a result which
is, in general, not zero.


Calculate the change in volume and the work done in going
from the initial to the final state, the integration being along two different
paths in each case.

[Fig. 1-1]

Since V = f(p,T),

$$dV = \left(\frac{\partial V}{\partial T}\right)_p dT + \left(\frac{\partial V}{\partial p}\right)_T dp = \frac{R}{p} dT - \frac{RT}{p^2} dp \tag{1-23}$$


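Let the first equation of path (AC in Fig. 1) be

$$T - T_1 = \left(\frac{T_2 - T_1}{p_2 - p_1}\right)(p - p_1) = \frac{\Delta T}{\Delta p}(p - p_1)$$

Then $dT = \frac{\Delta T}{\Delta p} dp$ and (23) becomes

$$dV = R\left[\frac{\Delta T}{\Delta p}\frac{1}{p} - \left(T_1 - \frac{\Delta T}{\Delta p} p_1\right)\frac{1}{p^2} - \frac{\Delta T}{\Delta p}\frac{1}{p}\right] dp$$

Upon integration,

$$V_2 - V_1 = \Delta V = \frac{R(T_2 p_1 - p_2 T_1)}{p_1 p_2}$$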



The second path will be considered as consisting of two parts: AB and BC
(cf. Fig. 1).

Along path AB, T = T₁, dT = 0 and along BC, p = p₂, dp = 0, hence

$$dV = -RT_1 \frac{dp}{p^2} + \frac{R}{p_2} dT$$

or

$$\Delta V = \frac{R(T_2 p_1 - p_2 T_1)}{p_1 p_2}$$

The change in volume is thus the same for these alternative paths.

A similar conclusion might have been drawn from the test for exactness: 

$$M = \frac{R}{p}; \qquad N = -\frac{RT}{p^2}$$

$$\frac{\partial M}{\partial p} = -\frac{R}{p^2} = \frac{\partial N}{\partial T}$$

which shows that (23) is exact. 
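The path independence of (23) can also be checked numerically. The following is a minimal sketch, assuming one mole of ideal gas and purely illustrative end states (the numerical values of R, T and p are not from the text); it integrates (23) by the midpoint rule along path AC and along path AB + BC and compares both with the closed form for ΔV.

```python
# Numerical check that dV = (R/p) dT - (R T / p**2) dp is path independent.
R = 8.314  # J/(mol K); one mole of ideal gas is assumed

def line_integral(path, form, steps=20000):
    """Midpoint-rule integral of M dT + N dp along path(s), 0 <= s <= 1."""
    total = 0.0
    for k in range(steps):
        Ta, pa = path(k / steps)
        Tb, pb = path((k + 1) / steps)
        M, N = form(0.5 * (Ta + Tb), 0.5 * (pa + pb))
        total += M * (Tb - Ta) + N * (pb - pa)
    return total

def dV(T, p):
    """Coefficients (M, N) of eq. (1-23)."""
    return R / p, -R * T / p**2

T1, p1 = 300.0, 1.0e5  # initial state A (illustrative values)
T2, p2 = 400.0, 2.0e5  # final state C (illustrative values)

def path_AC(s):
    """Straight line from A to C in the (T, p) plane."""
    return T1 + s * (T2 - T1), p1 + s * (p2 - p1)

def path_ABC(s):
    """AB at constant T = T1, then BC at constant p = p2."""
    if s < 0.5:
        return T1, p1 + 2 * s * (p2 - p1)
    return T1 + (2 * s - 1) * (T2 - T1), p2

delta_V_direct = line_integral(path_AC, dV)
delta_V_two_leg = line_integral(path_ABC, dV)
delta_V_formula = R * (T2 * p1 - p2 * T1) / (p1 * p2)
```

Both numerical values should agree with the closed form to within the integration error, which is what exactness of dV requires.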

The mechanical work done by an expanding gas is 

$$dW = p\,dV \tag{1-24}$$


regardless of the shape of the container and provided that the expansion is 
performed reversibly 2 in the thermodynamic sense. Combining (24) with 
(23) we obtain 


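$$dW = p\left(\frac{\partial V}{\partial T}\right)_p dT + p\left(\frac{\partial V}{\partial p}\right)_T dp = R\,dT - \frac{RT}{p}\,dp \tag{1-25}$$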


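It is clear that dW is inexact since

$$M = R; \qquad N = -\frac{RT}{p}$$

$$\frac{\partial M}{\partial p} = 0 \neq \frac{\partial N}{\partial T} = -\frac{R}{p}$$

By path AC,

$$dW = R\left[dT - \left(T_1 - \frac{\Delta T}{\Delta p}\,p_1\right)\frac{dp}{p} - \frac{\Delta T}{\Delta p}\,dp\right]$$

and, on integration,

$$\Delta W_1 = R\left[\Delta T - \left(T_1 - \frac{\Delta T}{\Delta p}\,p_1\right)\ln\frac{p_2}{p_1} - \Delta T\right] = -R\left(T_1 - \frac{\Delta T}{\Delta p}\,p_1\right)\ln\frac{p_2}{p_1}$$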





Along paths AB and BC,

$$dW = R\left[-T_1\,\frac{dp}{p} + dT\right]$$

$$\Delta W_2 = R\left[-T_1 \ln\frac{p_2}{p_1} + \Delta T\right]$$

Comparison of $\Delta W_1$ and $\Delta W_2$ shows that the work is different along the
two paths.
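The same comparison can be made numerically for the work. This sketch assumes one mole of ideal gas and the same illustrative end states as before (all numerical values are assumptions); it integrates (25) along both paths and compares with the closed forms for ΔW_1 and ΔW_2.

```python
import math

R = 8.314                      # J/(mol K), one mole of ideal gas assumed
T1, p1 = 300.0, 1.0e5          # initial state A (illustrative)
T2, p2 = 400.0, 2.0e5          # final state C (illustrative)
slope = (T2 - T1) / (p2 - p1)  # Delta T / Delta p along path AC

def work(path, steps=20000):
    """Midpoint-rule integral of dW = R dT - (R T / p) dp along path(s)."""
    total = 0.0
    for k in range(steps):
        Ta, pa = path(k / steps)
        Tb, pb = path((k + 1) / steps)
        Tm, pm = 0.5 * (Ta + Tb), 0.5 * (pa + pb)
        total += R * (Tb - Ta) - R * Tm / pm * (pb - pa)
    return total

def path_AC(s):
    """Straight line from A to C in the (T, p) plane."""
    return T1 + s * (T2 - T1), p1 + s * (p2 - p1)

def path_ABC(s):
    """AB at constant T = T1, then BC at constant p = p2."""
    if s < 0.5:
        return T1, p1 + 2 * s * (p2 - p1)
    return T1 + (2 * s - 1) * (T2 - T1), p2

W1_closed = -R * (T1 - slope * p1) * math.log(p2 / p1)
W2_closed = R * (-T1 * math.log(p2 / p1) + (T2 - T1))
W1_numeric = work(path_AC)
W2_numeric = work(path_ABC)
```

The two numerical results reproduce the respective closed forms but differ from each other, in line with the inexactness of dW.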

Heat absorbed or evolved in a process, dQ, also depends on the path.
The expression for the inexact differential with p and T as independent
variables is

$$dQ = C_p\,dT + \Lambda_p\,dp \tag{1-26}$$

where $C_p$ and $\Lambda_p$ are the continuous functions of T and p, known as the heat
capacity at constant pressure and the latent heat of change of pressure,
respectively.


Problem. Connect the points $p_1, V_1$ and $p_2, V_2$ of Fig. 1 with a circular arc. Integrate
(23) along this path.

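1.9. The Laws of Thermodynamics. — There are obvious advantages in
expressing the laws of thermodynamics in terms of quantities which are
independent of the path.³ As we have seen, both dQ and dW are inexact,
but the difference between them, a function known as the internal energy

$$dU = dQ - dW \tag{1-27}$$

is an exact differential. This equation⁴ often serves as a statement of the
first law of thermodynamics. By combining (25) and (26) we may also
write

$$dU = \left[C_p - p\left(\frac{\partial V}{\partial T}\right)_p\right] dT + \left[\Lambda_p - p\left(\frac{\partial V}{\partial p}\right)_T\right] dp \tag{1-28}$$

with the additional requirement of exactness from (22)

$$\frac{\partial}{\partial p}\left[C_p - p\left(\frac{\partial V}{\partial T}\right)_p\right]_T = \frac{\partial}{\partial T}\left[\Lambda_p - p\left(\frac{\partial V}{\partial p}\right)_T\right]_p \tag{1-29}$$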


3 This fact was recognized by Clausius, "The Mechanical Theory of Heat," translated
by W. R. Browne, Macmillan & Co., London, 1879, who discusses the laws of
thermodynamics from this standpoint.




These two equations are a more satisfactory definition of the first law than
(27) since they show the essential fact that the internal energy, dU, is an
exact differential. The inexactness of dQ and dW is sometimes indicated
by stating the first law in the form of (27) with symbols such as đQ, DQ
or δQ on the right.

The second law of thermodynamics is based upon an attempt to find a
function of dQ which is an exact differential. From (27) and (24)

$$dQ = dU + dW = dU + p\,dV \tag{1-27a}$$

but U = f(V,T), hence

$$dQ = \left(\frac{\partial U}{\partial T}\right)_V dT + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] dV \tag{1-30}$$

In passing from an initial state, $V_1, T_1$, to a final state, $V_2, T_2$, the integral
of the right-hand side of (30) cannot be evaluated without further information
since the second term contains both p and V. In the special case of an
ideal gas where pV = RT and $(\partial U/\partial V)_T = 0$, (30) becomes



$$dQ = \left(\frac{\partial U}{\partial T}\right)_V dT + \frac{RT\,dV}{V} \tag{1-31}$$

The first term on the right of this expression is the heat capacity at constant
volume and depends on the temperature alone. If therefore we make the
further restriction of constant temperature, that is, assume the process to
be isothermal, the integral may be obtained. The form of (31) suggests that
if we divide by T, the resulting equation


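$$\frac{dQ}{T} = \frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_V dT + \frac{R\,dV}{V}$$

may also be integrated when T changes. The more general inexact differential
(26) when divided by T is also exact, the quantity S so defined being
the entropy

$$dS = \frac{dQ}{T} = \frac{C_p}{T}\,dT + \frac{\Lambda_p}{T}\,dp \tag{1-32}$$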


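The condition for exactness

$$\left[\frac{\partial}{\partial p}\left(\frac{C_p}{T}\right)\right]_T = \left[\frac{\partial}{\partial T}\left(\frac{\Lambda_p}{T}\right)\right]_p \tag{1-33}$$

follows from (22). These arguments concerning the first and second laws are
intended only to show their property of exactness. The most satisfactory
formulation of these laws is probably that of Carathéodory. We consider this
subject in sec. 1.15.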
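The role of 1/T as an integrating factor can be illustrated numerically for the ideal-gas form (31). The sketch below assumes one mole of monatomic ideal gas with constant heat capacity at constant volume, and an arbitrary rectangular cycle in the (T, V) plane; these choices are illustrative assumptions. It integrates dQ and dQ/T around the closed cycle.

```python
import math

R = 8.314
CV = 1.5 * R  # assumed constant heat capacity (monatomic ideal gas)

def cycle_integral(form, corners, steps=4000):
    """Midpoint-rule integral of M dT + N dV around a closed polygon."""
    total = 0.0
    n = len(corners)
    for i in range(n):
        (Ta, Va), (Tb, Vb) = corners[i], corners[(i + 1) % n]
        for k in range(steps):
            s = (k + 0.5) / steps
            M, N = form(Ta + s * (Tb - Ta), Va + s * (Vb - Va))
            total += (M * (Tb - Ta) + N * (Vb - Va)) / steps
    return total

def dQ(T, V):
    """Coefficients of eq. (1-31): dQ = Cv dT + (R T / V) dV."""
    return CV, R * T / V

def dS(T, V):
    """Coefficients of dQ/T: (Cv / T) dT + (R / V) dV."""
    return CV / T, R / V

# Rectangular cycle in the (T, V) plane, illustrative values only.
corners = [(300.0, 0.020), (300.0, 0.040), (400.0, 0.040), (400.0, 0.020)]

heat_around_cycle = cycle_integral(dQ, corners)
entropy_around_cycle = cycle_integral(dS, corners)
expected_heat = -100.0 * R * math.log(2.0)  # closed form for this cycle
```

The cyclic integral of dQ is different from zero, while that of dQ/T vanishes to within rounding, as expected for an exact differential.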

The functions dU and dS may be combined by using (24), (27) and 


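(32), to give

$$dU = T\,dS - p\,dV \tag{1-34}$$

Since

$$U = f(S,V) \tag{1-35}$$

and dU is exact, we may also write

$$dU = \left(\frac{\partial U}{\partial S}\right)_V dS + \left(\frac{\partial U}{\partial V}\right)_S dV \tag{1-36}$$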


Comparison of (34) with (36) shows that

$$T = \left(\frac{\partial U}{\partial S}\right)_V; \qquad p = -\left(\frac{\partial U}{\partial V}\right)_S$$

The importance of (35) arises from the fact that if U is known as a function
of two independent variables, S and V, it is possible to calculate numerical
values of p, T and U for any thermodynamic state when S and V are given.
A quantity like U thus furnishes more information than the equation of
state, for the latter will only give p, V and T; in order to obtain U and S,
the heat capacity as a function of temperature must also be given. It is not
necessary to choose S and V as the independent variables in (35) or (36), in
fact any pair of the set: p, V, T, S (or of the functions to be defined immediately)
may be taken, but the resulting exact differential is simpler when S
and V are selected.

When the conditions of a specific problem suggest another pair of inde- 
pendent variables, it is more convenient to define additional thermodynamic 
functions. These are given in the following relations, where the symbol as 
used by Gibbs precedes the one now customary. 6 

The heat content or enthalpy, χ = H = U + pV

$$dH = dU + p\,dV + V\,dp = T\,dS + V\,dp \tag{1-37}$$

The work content or Helmholtz free energy, ψ = A = U − TS

$$dA = dU - T\,dS - S\,dT = -S\,dT - p\,dV \tag{1-38}$$

6 Gibbs preferred S and V as independent variables for reasons given in loc. cit., 
footnote on page 34. 




The free energy or Gibbs thermodynamic potential, ζ = F = U − TS + pV

$$dF = dU - T\,dS - S\,dT + p\,dV + V\,dp = -S\,dT + V\,dp \tag{1-39}$$

As in the case of dU, any pair of the set: p, V, T, S, U, H, A, F may
be chosen as independent variables, but the exact differential is simpler when
expressed in terms of the functions shown in the last equation of (37),
(38) or (39). Since most experimental work is done at constant pressure
rather than at constant volume, it is obvious that H and F (where the
pressure is one of the independent variables) are more generally useful
than U and A. The whole of the thermodynamics of systems of constant
composition may be developed, however, using any one of the following sets
of variables: (1) U, S, V; (2) H, S, p; (3) A, T, V; (4) F, T, p.

It is frequently necessary to have some means of predicting the direction 
in which a system spontaneously approaches a state of thermodynamic 
equilibrium. Let us consider two bodies, one at a temperature $T_1$ and the
other at a lower temperature $T_2$. Then if the whole system is surrounded
by adiabatic walls so that no heat enters it, we may write

$$dS_1 = -\frac{dQ}{T_1}; \qquad dS_2 = \frac{dQ}{T_2}$$

where dQ is the heat absorbed by the colder body. The total entropy of the
system thus increases, for

$$dS = dS_1 + dS_2 = dQ\,\frac{T_1 - T_2}{T_1 T_2} > 0$$


Clearly dS = 0 when thermal equilibrium is reached. From (39), we also
see that at constant temperature and pressure, dF = 0 when equilibrium
is established. Since the entropy reaches a maximum, the free energy
simultaneously reaches a minimum. In Table 1, we collect the criteria
for equilibrium when the indicated pairs of independent variables are held
constant.

TABLE 1. DEPENDENT VARIABLE BECOMES A MINIMUM

Independent Variables Fixed      Dependent Variable

T, p                             F
T, V                             A
S, p                             H
S, V                             U


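Problem a. Find expressions for S, H, V, A, U in terms of set (4).

Ans.

$$S = -\left(\frac{\partial F}{\partial T}\right)_p; \qquad H = F - T\left(\frac{\partial F}{\partial T}\right)_p;$$

$$V = \left(\frac{\partial F}{\partial p}\right)_T; \qquad A = F - p\left(\frac{\partial F}{\partial p}\right)_T;$$

$$U = F - T\left(\frac{\partial F}{\partial T}\right)_p - p\left(\frac{\partial F}{\partial p}\right)_T$$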


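Problem b. Verify the following equations which are known as Maxwell's relations:

$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V; \qquad \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V$$

$$\left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p; \qquad \left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p$$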


1.10. Systematic Derivation of Partial Thermodynamic Derivatives. —
With the addition of Q and W, we have ten important thermodynamic
quantities. The heat capacities are not included in the list, since by their
definitions: $C_p = (\partial Q/\partial T)_p$, $C_v = (\partial Q/\partial T)_V$, they may be readily determined
from the set of ten just mentioned. We now wish to describe methods
of obtaining all first order partial derivatives of the form $(\partial x/\partial y)_z$ where
x, y and z are any members of the set. It is immediately apparent that
there are a large number of them for there are ten ways of choosing x,
leaving nine and eight ways, respectively, of choosing y and z, a total of
720 first derivatives. When all possible relations between the first derivatives
are included, the total number of equations is increased enormously
for in general, a selected derivative may be written in terms of three other
derivatives which are independent of each other as the following considerations
show. Suppose x = f(y,w), then

$$dx = \left(\frac{\partial x}{\partial y}\right)_w dy + \left(\frac{\partial x}{\partial w}\right)_y dw$$

and, dividing by dy at constant z,

$$\left(\frac{\partial x}{\partial y}\right)_z = \left(\frac{\partial x}{\partial y}\right)_w + \left(\frac{\partial x}{\partial w}\right)_y \left(\frac{\partial w}{\partial y}\right)_z$$

There are, of course, many cases where there are relations between fewer
than four derivatives but neglecting these, the total number of equations
obtainable is the number of combinations of 720 derivatives taken four at
a time, 720!/(4! 716!) or approximately $10^{10}$. Although many of the relations
are of little use, it is convenient to devise a systematic method for
obtaining any of them.
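The counting argument is easy to reproduce. A minimal sketch using only the Python standard library:

```python
import math

# Ten quantities: choose x, then y from the remaining nine, then z from eight.
first_derivatives = 10 * 9 * 8

# Relations among four derivatives: combinations of 720 taken four at a time.
relations = math.comb(first_derivatives, 4)
```

The exact count is 11,104,365,420, which is the "approximately 10^10" quoted above.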

The best known of these methods is that of Bridgman⁷ which is simple
and often used. It will be described only briefly since it is a special case
of a more general procedure which we give in sec. 1.13. It is unnecessary
to compute the $10^{10}$ relations because any one of them could be obtained if
the 720 first derivatives were tabulated in terms of the same set of three
independent derivatives. The particular choice of the three is arbitrary,
Bridgman having taken

$$\left(\frac{\partial V}{\partial T}\right)_p, \qquad \left(\frac{\partial V}{\partial p}\right)_T, \qquad C_p = \left(\frac{\partial Q}{\partial T}\right)_p$$

because these are directly obtainable by experiment. One could then take
any four derivatives, write them in terms of the chosen three and eliminate
the three derivatives from the four equations. The result would be a single
equation containing the four derivatives.

7 Bridgman, P. W., "Condensed Collection of Thermodynamic Formulas,"
Harvard University Press, Cambridge, Mass., 1926.

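The 720 derivatives could then be classified into ten groups by holding
one quantity constant and varying the other nine. Within the group
containing derivatives at constant z,

$$\left(\frac{\partial x}{\partial y}\right)_z = \frac{(\partial x/\partial w)_z}{(\partial y/\partial w)_z}$$

which follows by writing according to (11)

$$dx = \left(\frac{\partial x}{\partial w}\right)_z dw + \left(\frac{\partial x}{\partial z}\right)_w dz; \qquad dy = \left(\frac{\partial y}{\partial w}\right)_z dw + \left(\frac{\partial y}{\partial z}\right)_w dz \tag{1-41}$$

setting dz = 0 and dividing one equation by the other. It should be
remembered that even if x and y are not functions of w and z it is
possible to have inexact differentials of the form of (41), hence the present
arguments apply to dQ and dW as well as to the remaining eight thermodynamic
functions. Upon adopting the abbreviations

$$\left(\frac{\partial x}{\partial w}\right)_z = (\partial x)_z; \qquad \left(\frac{\partial y}{\partial w}\right)_z = (\partial y)_z$$

any derivative at constant z is obtained by taking the ratio of the proper pair, or

$$\left(\frac{\partial x}{\partial y}\right)_z = \frac{(\partial x)_z}{(\partial y)_z}$$

The task of computing the 72 derivatives in this group is thus reduced to
tabulation of the nine quantities $(\partial x)_z$, $(\partial y)_z$, ···. The latter are easily
found when several of the derivatives $(\partial x/\partial y)_z$ are known in terms of the
fundamental three for it proves possible to split the former into numerator
and denominator by inspection.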

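If each of the remaining groups were treated in a similar way, 90 expressions
of the form $(\partial x)_z$, $(\partial y)_z$, $(\partial x)_y$, ··· would be obtained but in every
case $(\partial x)_y = -(\partial y)_x$ so that the final list need contain only 45 relations;
these are given by Bridgman (loc. cit.) in convenient tables.⁸ The following
examples show their use. Let it be required to calculate $(\partial T/\partial p)_H$.
From the tables, $(\partial T)_H = V - T(\partial V/\partial T)_p$, $(\partial p)_H = -C_p$, thus

$$\left(\frac{\partial T}{\partial p}\right)_H = \frac{1}{C_p}\left[T\left(\frac{\partial V}{\partial T}\right)_p - V\right]$$

Many alternative forms are easily found, for example,

$$\left(\frac{\partial T}{\partial S}\right)_p = \frac{T}{C_p}; \qquad \left(\frac{\partial T}{\partial p}\right)_S = \frac{T}{C_p}\left(\frac{\partial V}{\partial T}\right)_p; \qquad \left(\frac{\partial S}{\partial p}\right)_H = -\frac{V}{T}$$

hence,

$$\left(\frac{\partial T}{\partial p}\right)_H = \left(\frac{\partial T}{\partial p}\right)_S + \left(\frac{\partial T}{\partial S}\right)_p \left(\frac{\partial S}{\partial p}\right)_H$$

Additional examples, tables for a few of the second derivatives, and extension
of the method to include mechanical variables other than pressure
have also been given by Bridgman.

A further amplification of the method has been presented by Goranson⁹
whose tables include the following cases: (1) one-component unit mass
systems (constant total mass); (2) one-component variable mass systems
or two-component unit mass systems; (3) two-component variable mass
systems or three-component unit mass systems; (4) three-component variable
mass systems or four-component unit mass systems. Simplified methods
of constructing such tables have been proposed by several authors.¹⁰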
1.11. Thermodynamic Derivatives by Method of Jacobians. — A more
general method which is based on the properties of functional determinants
or Jacobians has been described by Shaw.¹¹ The mathematical basis on
which it is founded will be discussed in detail in order to explain the
construction of the required table and its application to specific examples.

8 For abbreviated tables, see, for example, Slater, "Introduction to Chemical
Physics," McGraw-Hill Book Co., New York, 1939; or Glasstone, loc. cit.

9 Goranson, Roy W., "Thermodynamic Relations in Multi-component Systems,"
Carnegie Institution of Washington, Washington, D. C., 1930.

1.12. Properties of the Jacobian. — The Jacobian of x and y with
respect to two independent variables, u and v, is defined by

$$J(x,y/u,v) = \frac{\partial(x,y)}{\partial(u,v)} = \begin{vmatrix} \left(\dfrac{\partial x}{\partial u}\right)_v & \left(\dfrac{\partial x}{\partial v}\right)_u \\ \left(\dfrac{\partial y}{\partial u}\right)_v & \left(\dfrac{\partial y}{\partial v}\right)_u \end{vmatrix} \tag{1-42}$$

When the independent variables are discernible from the context, the
Jacobian may be abbreviated as J(x,y), the second form of (42) being
reserved for cases where it is necessary to give the independent variables
explicitly. The following properties are obtained directly from the
definition of the Jacobian:

$$J(u,v) = -J(v,u) = 1;$$

$$J(x,x) = 0; \qquad J(k,x) = 0; \qquad k, \text{ any constant} \tag{1-43}$$

$$J(x,y) = J(y,-x) = J(-y,x) = -J(y,x)$$

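A further important property of the Jacobian arises if x and y are explicit
functions of z and w, which in turn are explicit functions of u and v. Writing
$\partial(x,y)/\partial(z,w)$ and $\partial(z,w)/\partial(u,v)$ in determinant form, using the rule for
the multiplication of determinants,¹² the abbreviations $(\partial x/\partial z)_w = x_z$ and
so on, we have

$$\frac{\partial(x,y)}{\partial(z,w)} \times \frac{\partial(z,w)}{\partial(u,v)} = \begin{vmatrix} x_z & x_w \\ y_z & y_w \end{vmatrix} \times \begin{vmatrix} z_u & z_v \\ w_u & w_v \end{vmatrix} = \begin{vmatrix} x_z z_u + x_w w_u & x_z z_v + x_w w_v \\ y_z z_u + y_w w_u & y_z z_v + y_w w_v \end{vmatrix}$$

$$= \left(\frac{\partial x}{\partial u}\right)_v \left(\frac{\partial y}{\partial v}\right)_u - \left(\frac{\partial x}{\partial v}\right)_u \left(\frac{\partial y}{\partial u}\right)_v = \frac{\partial(x,y)}{\partial(u,v)} \tag{1-44}$$

A typical element of the product

$$x_z z_u + x_w w_u = \left(\frac{\partial x}{\partial z}\right)_w \left(\frac{\partial z}{\partial u}\right)_v + \left(\frac{\partial x}{\partial w}\right)_z \left(\frac{\partial w}{\partial u}\right)_v = \left(\frac{\partial x}{\partial u}\right)_v$$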
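The product rule (44) can be spot-checked numerically. The sketch below uses central differences and four arbitrary smooth test functions; the functions and the evaluation point are hypothetical, chosen only for illustration.

```python
import math

def jacobian(f, g, a, b, h=1e-5):
    """Central-difference estimate of d(f,g)/d(a,b) at the point (a, b)."""
    fa = (f(a + h, b) - f(a - h, b)) / (2 * h)
    fb = (f(a, b + h) - f(a, b - h)) / (2 * h)
    ga = (g(a + h, b) - g(a - h, b)) / (2 * h)
    gb = (g(a, b + h) - g(a, b - h)) / (2 * h)
    return fa * gb - fb * ga

# Hypothetical test functions: z, w depend on (u, v); x, y depend on (z, w).
def z(u, v):
    return u * v + math.sin(u)

def w(u, v):
    return u - v**2

def x(zz, ww):
    return zz**2 + ww

def y(zz, ww):
    return math.exp(0.1 * zz) * ww

u0, v0 = 0.7, 0.3
z0, w0 = z(u0, v0), w(u0, v0)

# Left side of (44): d(x,y)/d(z,w) times d(z,w)/d(u,v).
lhs = jacobian(x, y, z0, w0) * jacobian(z, w, u0, v0)

# Right side of (44): d(x,y)/d(u,v) of the composed functions.
rhs = jacobian(lambda u, v: x(z(u, v), w(u, v)),
               lambda u, v: y(z(u, v), w(u, v)), u0, v0)
```

The two sides agree to within the finite-difference error.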


11 Shaw, A. N., Phil. Trans. Roy. Soc. (London) A234, 299-328 (1935).

12 The properties of determinants, which are used here, are discussed in Chapter 1 


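In the important special case, y = v,

$$\frac{\partial(x,v)}{\partial(u,v)} = \left(\frac{\partial x}{\partial u}\right)_v \tag{1-45}$$

for

$$\left(\frac{\partial v}{\partial u}\right)_v = 0; \qquad \left(\frac{\partial v}{\partial v}\right)_u = 1$$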


Since many thermodynamic functions are of the form f(x,y,z) = 0, where
any one variable is determined by the other two, we may write from (4),

$$dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy$$

or using (45)

$$dz = \frac{\partial(z,y)}{\partial(x,y)}\,dx + \frac{\partial(z,x)}{\partial(y,x)}\,dy$$

Expressing each of these variables in terms of two new independent variables,
r and s, and using the abbreviations $J(z,y) = \partial(z,y)/\partial(r,s)$, etc.,
(44) enables us to write

$$dz = \frac{J(z,y)}{J(x,y)}\,dx + \frac{J(z,x)}{J(y,x)}\,dy$$

If we multiply by J(x,y),

$$J(z,y)\,dx + J(x,z)\,dy + J(y,x)\,dz = 0 \tag{1-46}$$


since J(x,y) = −J(y,x), etc., from (43). If two more variables, u and v,
are related to r and s in the same way, (46) may be divided by du at constant
v, giving

$$J(z,y)\left(\frac{\partial x}{\partial u}\right)_v + J(x,z)\left(\frac{\partial y}{\partial u}\right)_v + J(y,x)\left(\frac{\partial z}{\partial u}\right)_v = 0$$

So that finally, again because of (45)

$$J(z,y)J(x,v) + J(x,z)J(y,v) + J(y,x)J(z,v) = 0 \tag{1-47}$$

Problem. If r, s are functions of x, y, z and the latter in turn are functions of two
independent variables u, v show that

$$J(r,s/u,v) = J(r,s/x,y)J(x,y/u,v) + J(r,s/y,z)J(y,z/u,v) + J(r,s/z,x)J(z,x/u,v)$$

1.13. Application to Thermodynamics. — Equation (47) is an
important one which determines all of the thermodynamic partial derivatives,
for if two independent variables, r and s, are chosen which completely
determine the others, x, y, z, v, then any one Jacobian, for example
J(x,y), is given in terms of five others. But if r and s are taken from the
set x, y, z, v, then J(x,y) is given in terms of only four others, since in
(47) $J(r,s) = \partial(r,s)/\partial(r,s) = 1$.

Let us choose p, V, T and S for x, y, z and v, respectively, so that

$$J(T,V)J(p,S) + J(p,T)J(V,S) + J(V,p)J(T,S) = 0 \tag{1-48}$$

One more reduction is possible since from (34),

$$\left(\frac{\partial U}{\partial V}\right)_S = -p; \qquad \left(\frac{\partial U}{\partial S}\right)_V = T$$

and

$$\frac{\partial^2 U}{\partial S\,\partial V} = \left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V$$

In Jacobian notation,

$$\frac{J(T,S)}{J(V,S)} = -\frac{J(p,V)}{J(S,V)}$$

Finally since J(V,S) = −J(S,V) from (43), we obtain

$$J(T,S) = J(p,V)$$

When the following abbreviations

$$a = J(V,T); \quad b = J(p,V) = J(T,S); \quad c = J(p,S); \quad l = J(p,T); \quad n = J(V,S) \tag{1-49}$$

are substituted into (48) and (43) is used to change the signs, we obtain

$$b^2 + ac - nl = 0 \tag{1-50}$$
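Relation (50) and the identity J(T,S) = J(p,V) can be checked numerically for a concrete model. The following sketch assumes one mole of a van der Waals gas with constant heat capacity at constant volume, in reduced units with illustrative constants (the model, the constants, and the choice of T and V as independent variables are assumptions, not taken from the text). The Jacobians are estimated by central differences.

```python
import math

R, CV, A0, B0 = 1.0, 1.5, 0.5, 0.1  # reduced units, illustrative constants

# State functions of the independent variables (T, V).
def temp(T, V):
    return T

def vol(T, V):
    return V

def pres(T, V):
    return R * T / (V - B0) - A0 / V**2

def entr(T, V):
    return CV * math.log(T) + R * math.log(V - B0)

def J(f, g, T, V, h=1e-5):
    """Central-difference Jacobian d(f,g)/d(T,V)."""
    fT = (f(T + h, V) - f(T - h, V)) / (2 * h)
    fV = (f(T, V + h) - f(T, V - h)) / (2 * h)
    gT = (g(T + h, V) - g(T - h, V)) / (2 * h)
    gV = (g(T, V + h) - g(T, V - h)) / (2 * h)
    return fT * gV - fV * gT

T0, V0 = 1.2, 2.0

a = J(vol, temp, T0, V0)      # a = J(V,T)
b = J(pres, vol, T0, V0)      # b = J(p,V)
b_alt = J(temp, entr, T0, V0) # b = J(T,S), should equal J(p,V)
c = J(pres, entr, T0, V0)     # c = J(p,S)
l = J(pres, temp, T0, V0)     # l = J(p,T)
n = J(vol, entr, T0, V0)      # n = J(V,S)

residual = b * b + a * c - n * l  # left side of eq. (1-50)
```

With T and V as independent variables a = −1, and the residual of (50) vanishes to within the finite-difference error.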


The entries for the lower left-hand corner of the table are obtained
by writing the definitions of dU, dH, etc., in Jacobian form. For example,
since

$$dU = T\,dS - p\,dV$$

$$J(U,z) = T\,J(S,z) - p\,J(V,z)$$

where z is any required variable. Hence, if z is taken as p and then as V

$$J(U,p) = T\,J(S,p) - p\,J(V,p) = -Tc + pb$$

$$J(U,V) = T\,J(S,V) - p\,J(V,V) = -Tn$$

the last forms following from the part of the table which is already filled
in or from the definitions in (49). The upper right-hand corner may be filled
at the same time, without further calculation, by changing all signs. The
table is completed by using relations already found, as for example

$$J(A,H) = -J(H,A) = -S\,J(T,H) - p\,J(V,H)$$

$$= -S(Tb - Vl) - p(Tn - Vb)$$

$$= -T(Sb + pn) + V(Sl + pb)$$

The final result is shown in Table 2. The use of it is typified by the
following examples.

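Example 1. Evaluate $(\partial F/\partial T)_V$ in terms of other partial derivatives with
T and V as independent variables. In Jacobian notation and from Table 2

$$\left(\frac{\partial F}{\partial T}\right)_V = \frac{J(F,V)}{J(T,V)} = \frac{Sa + Vb}{-a} = -S - \frac{Vb}{a}$$

But

$$\frac{b}{a} = \frac{J(p,V)}{J(V,T)} = -\frac{J(p,V)}{J(T,V)} = -\left(\frac{\partial p}{\partial T}\right)_V$$

hence,

$$\left(\frac{\partial F}{\partial T}\right)_V = -S + V\left(\frac{\partial p}{\partial T}\right)_V$$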

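Example 2. Transform the result of the preceding example into derivatives
with p and S as independent variables. If the previous result is used,
the term a causes trouble, since with p and S as independent variables, we
obtain $a = J(V,T) = \partial(V,T)/\partial(p,S)$, a relation which cannot be reduced
to a single derivative. In general, as we have shown, any partial derivative
may be expressed in terms of not more than three other derivatives of
thermodynamic functions. We therefore use (50), which gives $a = (nl - b^2)/c$, so that

$$\left(\frac{\partial F}{\partial T}\right)_V = -S - \frac{Vbc}{nl - b^2}$$

Taking S and p as the independent variables,

$$b = J(p,V) = \frac{\partial(p,V)}{\partial(S,p)} = -\left(\frac{\partial V}{\partial S}\right)_p$$

$$c = J(p,S) = \frac{\partial(p,S)}{\partial(S,p)} = -1$$

$$l = J(p,T) = \frac{\partial(p,T)}{\partial(S,p)} = -\left(\frac{\partial T}{\partial S}\right)_p$$

$$n = J(V,S) = \frac{\partial(V,S)}{\partial(S,p)} = -\left(\frac{\partial V}{\partial p}\right)_S$$

hence,

$$\left(\frac{\partial F}{\partial T}\right)_V = -S - V\left[\frac{(\partial V/\partial S)_p}{(\partial T/\partial S)_p (\partial V/\partial p)_S - (\partial V/\partial S)_p^2}\right]$$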

This procedure may be repeated using other quantities, such as T and
S, V and p, and so on, as independent variables. The difficulty in choosing
the proper form of the original relation may usually be removed in the
following way. Referring to the definitions of a, b, c, l and n, it is seen
that each can be reduced to unity by a proper choice of the independent
variables. For example, if the latter are chosen as V and T, a = 1, since
a = J(V,T). In the previous case, c = −1, and it was found advisable
to use some quantity other than a. The situation may be summed up in
the following directions. In case one of the letters in the top line of the set

$$\begin{pmatrix} c & l & n \\ a & n & l \end{pmatrix}$$

equals unity, do not use the one directly beneath it but transform
to another by means of (50). In this way, the resulting expression
will usually contain only three different partial derivatives. The omission
of b from the above list arises from the fact that even if b = 1, only single
derivatives will occur.

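Example 3. Solve for $(\partial p/\partial T)_V$ in terms of $C_v$, $C_p$ and $\mu = (\partial T/\partial p)_H$,
the Joule-Thomson coefficient. Problems of this sort frequently arise where
it is desired to express a partial thermodynamic derivative in terms of other
quantities, which are measured directly. The usual process of obtaining
this relationship is tedious and complex. From the table, it is found that

$$C_v = \left(\frac{\partial Q}{\partial T}\right)_V = \frac{Tn}{a}$$

$$C_p = \left(\frac{\partial Q}{\partial T}\right)_p = \frac{Tc}{l}$$

$$\mu = \left(\frac{\partial T}{\partial p}\right)_H = \frac{Tb - Vl}{Tc}$$

$$\left(\frac{\partial p}{\partial T}\right)_V = -\frac{b}{a}$$

Since there are three relations given and only two letters in the last derivative,
it is convenient to write this in the form

$$\left(\frac{\partial p}{\partial T}\right)_V = -\frac{b^2}{ab}$$

and to solve for a, b and $b^2$ in terms of $C_v$, $C_p$ and μ. Using (50) to obtain

$$b^2 = nl - ac = -\frac{al}{T}\,(C_p - C_v); \qquad b = \frac{l}{T}\,(\mu C_p + V)$$

and finally

$$\left(\frac{\partial p}{\partial T}\right)_V = \frac{C_p - C_v}{C_p\,\mu + V}$$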
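The result of Example 3 can be tested numerically by evaluating every derivative as a ratio of Jacobians, $(\partial x/\partial y)_z = J(x,z)/J(y,z)$, which follows from (44) and (45). The sketch assumes one mole of a van der Waals gas with constant heat capacity at constant volume, in reduced units with illustrative constants; the model and all helper names are assumptions made for illustration only.

```python
import math

R, CV, A0, B0 = 1.0, 1.5, 0.5, 0.1  # reduced units, illustrative constants

# State functions of the independent variables (T, V).
def temp(T, V):
    return T

def vol(T, V):
    return V

def pres(T, V):
    return R * T / (V - B0) - A0 / V**2

def entr(T, V):
    return CV * math.log(T) + R * math.log(V - B0)

def enth(T, V):
    return CV * T - A0 / V + pres(T, V) * V  # H = U + pV

def J(f, g, T, V, h=1e-5):
    """Central-difference Jacobian d(f,g)/d(T,V)."""
    fT = (f(T + h, V) - f(T - h, V)) / (2 * h)
    fV = (f(T, V + h) - f(T, V - h)) / (2 * h)
    gT = (g(T + h, V) - g(T - h, V)) / (2 * h)
    gV = (g(T, V + h) - g(T, V - h)) / (2 * h)
    return fT * gV - fV * gT

def partial(x, y, z, T, V):
    """(dx/dy) at constant z, written as J(x,z)/J(y,z)."""
    return J(x, z, T, V) / J(y, z, T, V)

T0, V0 = 1.2, 2.0
Cv = T0 * partial(entr, temp, vol, T0, V0)   # T (dS/dT)_V
Cp = T0 * partial(entr, temp, pres, T0, V0)  # T (dS/dT)_p
mu = partial(temp, pres, enth, T0, V0)       # Joule-Thomson coefficient

lhs = partial(pres, temp, vol, T0, V0)       # (dp/dT)_V directly
rhs = (Cp - Cv) / (mu * Cp + V0)             # result of Example 3
```

For this model $(\partial p/\partial T)_V = R/(V - B0)$, and both sides reproduce that value to within the finite-difference error.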

than one partial derivative mstead of th mU “ U ‘ SUally c °ntain 
Tft We 2, mstead three as i n the earlier cases. 

(~\ = _ ™ 

a 


If JT , wA"7 -p 

P “ d F " e ‘ ab,n * s Mepenctat variables, 


6 = 1; 


(i) 

(ii) 


a = J(V,T) = _ 

d (p,V) 

o = — — . /££\ _ n 

fi ’ VF/j,- 0 

ILr 0) . /<?tn 

' Uj, 



W/r 


a = 


In Shaw's paper (loc. cit.) the calculations are simplified for the
following cases: ... van der Waals' gas, saturated vapor, black-body radiation.
The method was extended by Shaw to include ... derivatives and to apply to
systems of variable composition. For further applications, as well as more
detail on the use of the tables, the original paper should be consulted.
Problem. Prove the following re la tions . 

am\ - _ 


(a) ix 


•(£) 

W* 


(b) C v - 


1_ 

"c P 

SdV 


C^tO— 

\dT. 





D 7 (fX 




\dpS T 

1.14. Thermodynamic Systems of Variable Mass. — ... equation could be
extended to include systems of variable mass.¹⁴ If
we consider a system composed of several substances whose masses are
$m_1, m_2, \cdots$ we may change the internal energy not only by varying the
entropy and the volume but also by varying the relative masses. Thus in
place of (35) we have

$$U = f(S, V, m_1, m_2, \cdots, m_n)$$

and in place of (36)

$$dU = \left(\frac{\partial U}{\partial S}\right)_{V,m_1,m_2,\cdots} dS + \left(\frac{\partial U}{\partial V}\right)_{S,m_1,m_2,\cdots} dV + \left(\frac{\partial U}{\partial m_1}\right) dm_1 + \left(\frac{\partial U}{\partial m_2}\right) dm_2 + \cdots \tag{1-51}$$

If we write

$$\mu_1 = \left(\frac{\partial U}{\partial m_1}\right)_{S,V,m_2,m_3,\cdots} \tag{1-52}$$

we have

$$dU = T\,dS - p\,dV + \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots \tag{1-53}$$

If U is eliminated from (53) by using in turn equations (37), (38) and
(39) we obtain

$$\mu_1 = \left(\frac{\partial H}{\partial m_1}\right)_{S,p,m_2,m_3,\cdots} = \left(\frac{\partial A}{\partial m_1}\right)_{T,V,m_2,m_3,\cdots} = \left(\frac{\partial F}{\partial m_1}\right)_{p,T,m_2,m_3,\cdots} \tag{1-54}$$

The partial derivatives defined by any of these equivalent expressions were
called by Gibbs the chemical potentials. We may also convert (53) into
the equation

$$dF = -S\,dT + V\,dp + \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots \tag{1-55}$$

At constant temperature and pressure and for a reversible process, as we
have shown, dF = 0; hence according to (55) the condition for equilibrium
is

$$dF = \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots = 0 \tag{1-56}$$

14 His results also included other variables such as electric, magnetic, and
gravitational fields as well as surface phenomena.

From this equation we may derive the celebrated phase rule of Gibbs.
Let us understand by phase a homogeneous part of a system separated from
the rest of the system by recognizable boundaries. Thus a mixture of ice,
liquid water, and steam is a system of three phases. The number of


components is the least number of independently variable constituents 
required to express the composition of each phase. In our previous exam- 
ple there is only one component. In a system composed of an aqueous 
solution of sugar there are two components for it is necessary to specify 
the amounts of both water and sugar present. Finally we need a definition 
of degree of freedom . It is the number of variables (such as temperature, 
pressure, composition of the components) which is required to describe 
completely the system at equilibrium. For example, liquid water in the 
presence of water vapor is a system of one degree of freedom, for we may 
vary either the temperature or the pressure but we cannot change both 
simultaneously for then either the liquid or the vapor disappears. 

Suppose a system contains C components and P phases, then an equation
of the form of (55) will hold for each phase. Since F like S and V is an
extensive variable, it follows from (55) that the chemical potentials must be
independent of the masses, so that we may integrate (56) term by term
obtaining

$$F = \mu_1 m_1 + \mu_2 m_2 + \cdots + \mu_C m_C \tag{1-57}$$

Differentiation of this equation results in

$$dF = \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots + \mu_C\,dm_C + m_1\,d\mu_1 + m_2\,d\mu_2 + \cdots + m_C\,d\mu_C$$

When it is subtracted from (56) we get

$$m_1\,d\mu_1 + m_2\,d\mu_2 + \cdots + m_C\,d\mu_C = 0 \tag{1-58}$$

Equilibrium can be established only when an equation of this form holds
for each of the P phases. But there are C + 2 variables $T, p, \mu_1, \mu_2, \cdots, \mu_C$,
hence the number of degrees of freedom f is

$$f = C + 2 - P \tag{1-59}$$

This simple equation has been of inestimable value in the study and inter- 
pretation of heterogeneous equilibrium by the chemist, physicist and 
metallurgist. 16 
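The phase rule (59) is simple enough to tabulate directly. The short sketch below applies it to the examples mentioned in the text; the function name is hypothetical.

```python
def degrees_of_freedom(components: int, phases: int) -> int:
    """Gibbs phase rule, eq. (1-59): f = C + 2 - P."""
    f = components + 2 - phases
    if f < 0:
        raise ValueError("more coexisting phases than the phase rule allows")
    return f

# Liquid water in the presence of water vapor: one component, two phases.
water_and_vapor = degrees_of_freedom(1, 2)

# Ice, liquid water and steam together: one component, three phases.
ice_water_steam = degrees_of_freedom(1, 3)

# Aqueous sugar solution as a single liquid phase: two components.
sugar_solution = degrees_of_freedom(2, 1)
```

The first case reproduces the single degree of freedom described above for liquid water in equilibrium with its vapor.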

1.15. The Principle of Carathéodory. — In most textbooks of thermodynamics,
the order of presentation parallels the historical development
of the subject. For this reason, considerable attention is paid to several
kinds of ideal or imaginary machines. The customary procedure is to
cite, first of all, the impossibility of constructing perpetual motion machines

... assertions which are incorporated into the science of thermodynamics.
The critical student may feel the need of a more logical and
formal approach, and this will now be given.

We have attempted to emphasize in sec. 1.9 one important mathematical
consequence of the laws of thermodynamics, namely, that functions
such as dU and dS are exact differentials. We now wish to discuss a
more fundamental mathematical property of these laws which was discovered
by Carathéodory. His arguments¹⁶ are derived from the geometric
behavior of a certain differential equation and its solution. As a result, he
is able to obtain in a purely formal way the laws of thermodynamics without
recourse to fictitious machines or such objectionable concepts as the
flow of heat. We cannot reproduce here the complete theory¹⁷ but shall
only give the mathematical details of his treatment of the second law.
Let us assume that a thermodynamic system is composed of n separate
parts, each one of which is characterized by its pressure and volume. Further,
suppose that the whole system is surrounded by adiabatic walls or
thermal insulators while the individual parts of the system are separated
from each other by walls that are perfect conductors of heat. As a result
of experiment, it is found that there is no observable change in the system
(equilibrium has been reached) when the following conditions are met:

$$f_1(p_1,V_1) = f_2(p_2,V_2) = \cdots = f_n(p_n,V_n) = F(\vartheta) \tag{1-60}$$

The relation $f_i(p_i,V_i) = F(\vartheta)$ for the i-th part of the system is, of course,
its equation of state, and $\vartheta$ is the temperature of the whole system on some
suitable empirical scale. According to the first law (see eq. 27a)

$$dQ = dU + p\,dV = 0 \tag{1-61}$$

the whole system being adiabatic. Moreover, a similar equation holds for
each part of the system:

$$dQ_i = dU_i + p_i\,dV_i \tag{1-62}$$

$$dU = \sum_{i=1}^{n} dU_i; \qquad dQ = \sum_{i=1}^{n} dQ_i \tag{1-63}$$

As we have shown, $dQ_i$ is not an exact differential. However, it depends
on only two variables, and under these conditions an infinite number
of integrating denominators exist.¹⁸ Hence eq. (62) may be converted into
an exact differential. Let an integrating denominator be $t_i$, so that

$$d\phi_i = dQ_i/t_i \tag{1-64}$$

is exact. Clearly $\phi_i$ is then a function of the state of the system, hence we
may change (61) in such a manner that the independent variables are $\phi_i$
and $\vartheta$ instead of U and V. The result of this transformation is

$$dQ = \sum_{i=1}^{n}\left[\left(\frac{\partial U_i}{\partial \phi_i} + p_i\,\frac{\partial V_i}{\partial \phi_i}\right) d\phi_i + \left(\frac{\partial U_i}{\partial \vartheta} + p_i\,\frac{\partial V_i}{\partial \vartheta}\right) d\vartheta\right] = 0 \tag{1-65}$$

16 Carathéodory, C., Math. Ann. 67, 355 (1909).

17 Carathéodory's theory has been reviewed by Born, M., Physik. Z. 22, 218, 249,
... (1922) and by Landé, A., "Handbuch der Physik," Vol. IX, Chapter 4, J. Springer,
Berlin, 1926. See also, Buchdahl, H. A., Am. J. Phys. 17, 41, 44, 212 (1949).


The quantity dQ is not exact, nor is it to be taken for granted that it can 
be made exact by the use of an integrating denominator if dQ contains 
more than two variables. As a matter of fact, the procedure is possible 
only when the differential equation dQ = 0 (known as a Pfaff equation) 
possesses a solution, as we shall show in sec. 2.18. In that case (and we 
shall here be interested in no other), there is an integrating denominator t
such that

$$d\phi = dQ/t \tag{1-66}$$


is exact, even when there are n variables. More important for our present 
needs is the conclusion drawn from simple geometric considerations that if 
there is an integrating denominator, then there are in the neighborhood of 
any point P many other points which are not accessible from P along the 
path dQ = 0. This formal mathematical consequence of the properties 
of the Pfaff equation is known as the principle of Carathéodory. It is
exactly what we need for thermodynamics. Consider, for example, a gas
at a given pressure, $p_1$ and volume, $V_1$. We may expand or compress this
gas adiabatically (i.e., along the path dQ = 0), but the final state of the
system will be characterized by variables $p_2$, $V_2$ which we cannot choose at
will. There are many values of p and V which we are not able to realize
adiabatically. 

We refer the reader again to sec. 2.18 for the conditions under which 
equations like (65) have a solution, hence an integrating denominator. 
We proceed here with the physical results which may be obtained when we 
know that the integrating denominator exists. In order to simplify the 
situation let us assume that the thermodynamic system is composed of 
only two parts. This restriction does not mean that there is any loss in 


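generality. From (63), (64) and (66) we then have

$$t\,d\phi = t_1\,d\phi_1 + t_2\,d\phi_2 \tag{1-67}$$

If we take as in (65), $\phi_1$, $\phi_2$ and $\vartheta$ as independent variables we see that

$$\frac{\partial \phi}{\partial \phi_1} = \frac{t_1}{t}; \qquad \frac{\partial \phi}{\partial \phi_2} = \frac{t_2}{t}; \qquad \frac{\partial \phi}{\partial \vartheta} = 0 \tag{1-68}$$

The last equation of (68) shows that $\phi$ depends on $\phi_1$ and $\phi_2$ but not on $\vartheta$,
so that according to the other two equations of (68), the ratios $t_1/t$ and
$t_2/t$ are also independent of $\vartheta$:

$$\frac{\partial}{\partial \vartheta}\left(\frac{t_1}{t}\right) = 0; \qquad \frac{\partial}{\partial \vartheta}\left(\frac{t_2}{t}\right) = 0$$

This result may be written:

$$\frac{1}{t_1}\frac{\partial t_1}{\partial \vartheta} = \frac{1}{t_2}\frac{\partial t_2}{\partial \vartheta} = \frac{1}{t}\frac{\partial t}{\partial \vartheta} \tag{1-69}$$

But $t_1$ is a function of the state of the first member of the system and therefore
could depend only on $\phi_1$ and $\vartheta$, while $t_2$ could depend only on $\phi_2$ and $\vartheta$.
However, (69) indicates that $t_1$ and $t_2$ must actually satisfy the following
relation

$$\frac{\partial \ln t_1}{\partial \vartheta} = \frac{\partial \ln t_2}{\partial \vartheta} = \frac{\partial \ln t}{\partial \vartheta} = g(\vartheta) \tag{1-70}$$

where $g(\vartheta)$ is a function which is common to all systems in thermal contact,
independent of any special properties of the substances which compose
the system. Integrating (70), we obtain

$$\ln t = \int g(\vartheta)\,d\vartheta + \ln A(\phi) \tag{1-71}$$

where the integration constant ln A depends only on the quantity $\phi$.
Note that we have dropped the subscripts from t and $\phi$ so that eq. (71)
applies to any thermodynamic system and t is the appropriate integrating
denominator for the particular system under consideration. We see from
(71) the important fact that this denominator can be separated into two
parts, one depending only on the empirical temperature $\vartheta$ and the other
only on variables of the state of the system such as $\phi$ whose differential is
exact.

Let us rewrite (71) in the form

$$t = A\,e^{\int g(\vartheta)\,d\vartheta} \tag{1-72}$$

and define the absolute temperature T by the relation

$$T(\vartheta) = C\,e^{\int g(\vartheta)\,d\vartheta} \tag{1-73}$$

The constant C relating $\vartheta$ and T may be determined by requiring that
between two fixed points, say the boiling point and freezing point of water,
T shall increase by 100 units. It should be noticed that there is no additive
constant in (73), so that if C is positive, the smallest value of T is zero but
there is no upper limit for T.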
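As an illustration of how (73) fixes the absolute scale, suppose the empirical temperature is read on an ideal-gas thermometer graduated in degrees Celsius, for which one may take $g(\vartheta) = 1/(\vartheta + 273.15)$. This form of g, the lower limit of integration at the freezing point, and the helper names below are assumptions made for the sketch; they are not given in the text.

```python
import math

T_ICE = 273.15  # assumed offset of the ideal-gas Celsius thermometer

def g(theta):
    """Hypothetical g(theta) for that thermometer: 1 / (theta + T_ICE)."""
    return 1.0 / (theta + T_ICE)

def integral_g(theta, steps=10000):
    """Midpoint-rule integral of g from 0 (freezing point) to theta."""
    h = theta / steps
    return sum(g((k + 0.5) * h) for k in range(steps)) * h

# Eq. (1-73): T(theta) = C * exp(integral of g). The constant C is fixed by
# requiring T(boiling point) - T(freezing point) = 100 units.
C = 100.0 / (math.exp(integral_g(100.0)) - 1.0)

def absolute_temperature(theta):
    """Absolute temperature corresponding to the empirical reading theta."""
    return C * math.exp(integral_g(theta))
```

Under these assumptions C comes out as the absolute temperature of the freezing point, and absolute_temperature(25.0) reproduces the familiar shift of the Celsius reading.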

If our thermodynamic system contains only one part, we may use (72),
(73) and (66) to write

$$dQ = t\,d\phi = \frac{T A\,d\phi}{C} \tag{1-74}$$

Also, if we put

$$S = \frac{1}{C}\int A(\phi)\,d\phi + \text{const.} \tag{1-75}$$

we obtain the well-known expression for the second law of thermodynamics
which defines a change in entropy, dS:

$$dQ = T\,dS \tag{1-76}$$

The entropy is immediately seen to be a function of the state of the system,
constant along an adiabatic path (dQ = 0). It is determined except for an
additive constant. We also note from (76) that the absolute temperature
is an integrating denominator of the inexact differential dQ.

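When the system is made up of two parts which are in thermal contact,
eqs. (67) and (74) may be combined to give

$$A\,d\phi = A_1\,d\phi_1 + A_2\,d\phi_2 \tag{1-77}$$

We know that $A_1$ is a function of $\phi_1$ and that $A_2$ is a function of $\phi_2$; we
want to prove that A is a function of $\phi$ which in turn depends on $\phi_1$ and
$\phi_2$. Let us assume that $A = A(\phi)$. Then

$$\frac{\partial A}{\partial \phi_1} = \frac{\partial A}{\partial \phi}\,\frac{\partial \phi}{\partial \phi_1}; \qquad \frac{\partial A}{\partial \phi_2} = \frac{\partial A}{\partial \phi}\,\frac{\partial \phi}{\partial \phi_2}$$

If we eliminate $\partial A/\partial \phi$ from these two equations we obtain

$$\frac{\partial A}{\partial \phi_1}\,\frac{\partial \phi}{\partial \phi_2} - \frac{\partial A}{\partial \phi_2}\,\frac{\partial \phi}{\partial \phi_1} = 0$$

This result is often written in the Jacobian notation of sec. 1.12

$$J(A,\phi) = \frac{\partial(A,\phi)}{\partial(\phi_1,\phi_2)} = 0 \tag{1-78}$$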




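It tells us 19 that if A is a function of $\phi$, $J(A,\phi) = 0$ and conversely, if
J = 0, then A is a function of $\phi$. We can easily prove in our case
that the Jacobian does vanish. Since (77) implies $A_1 = A\,\partial\phi/\partial\phi_1$ and
$A_2 = A\,\partial\phi/\partial\phi_2$, differentiation of (77) results in

$$\frac{\partial A_2}{\partial\phi_1} = \frac{\partial A}{\partial\phi_1}\frac{\partial\phi}{\partial\phi_2} + A\frac{\partial^2\phi}{\partial\phi_1\,\partial\phi_2} = 0$$

$$\frac{\partial A_1}{\partial\phi_2} = \frac{\partial A}{\partial\phi_2}\frac{\partial\phi}{\partial\phi_1} + A\frac{\partial^2\phi}{\partial\phi_1\,\partial\phi_2} = 0$$

hence by subtraction we obtain (78). Thus A is a function of $\phi$. Under
these conditions we have an equation similar to (76) for each part of the
thermodynamic system, and since $dQ = \sum dQ_i$, we finally conclude from
(75) and (77) that $dS = \sum dS_i$.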


19 This result, which may be applied in the case of n variables, is often useful. If
n functions $y_1, y_2, \ldots, y_n$ are not independent of each other the Jacobian vanishes; if
J = 0, then the n functions are related by some equation $f(y_1, y_2, \ldots, y_n) = 0$. Cf.
sec. 3.13.




CHAPTER 2 

ORDINARY DIFFERENTIAL EQUATIONS 


2.1. Preliminaries. — The customary classification distinguishes two 
main types: ordinary and partial differential equations. The former
contain only one independent variable and, as a consequence, total deriva-
tives. They represent a relation between the primitive of the dependent
variable (y), its various derivatives, and functions of the independent
variable (x). Partial differential equations, whose study will be reserved
for Chapter 7, contain several independent variables and hence partial
derivatives. Concerning terminology, the following is to be noted in
connection with ordinary differential equations.

The order of a differential equation is the order of its highest derivative;
its degree is the degree (or power) of the derivative of highest order after
the equation has been rationalized, i.e., after fractional powers of all
derivatives have been removed. Thus the equation



is of the second order and the first degree, while 



is of the second order and the second degree. If the dependent variable
and all its derivatives occur in the first degree and not multiplying each
other, the equation is said to be linear. The solution of an equation of
n-th order involves, in principle, the carrying out of n quadratures or inte-
grations. Since each of them introduces one arbitrary constant, the final
expression for the dependent variable will contain n arbitrary constants.
However, a solution in which one or more of these constants are given




pendent 1 arbitrary constants; (2) particular solutions, obtainable from 
the general one by fixing one or more of the constants. In addition to 
these, differential equations of degree higher than the first frequently possess 
solutions, known as singular ones, which cannot be formed from the general 
solution in this manner. An example of these will be discussed briefly in 
sec. 2.6; they are rarely of interest in physical or chemical applications. 


FIRST ORDER EQUATIONS 

An equation of the first order can always be solved although the solu- 
tion may sometimes not be expressible in terms of familiar or named 
functions. Methods of solution applicable in the most frequently occurring 
cases will now be given, and the discussion of each method will be followed 
by a list of problems, arising in physics and chemistry, which lead to 
differential equations solvable by the scheme in question. 

2.2. The Variables are Separable. — This is true when the equation, 

which may originally appear in the form $f_1(x,y)\,\dfrac{dy}{dx} + f_2(x,y) = 0$, is
reducible to

$$f(x)\,dx + g(y)\,dy = 0$$


Such an equation can be integrated at once and leads to a relation between 
y and x . 


Examples. 

a. Organic growth; radioactive decay . 

Bacterial cultures in an unlimited nutritive medium grow at a time rate 
proportional to the number of bacteria present at any moment. Hence if 
the time t is regarded as independent variable and N, the number of bacteria 
present at time t as dependent variable, 


$$\frac{dN}{dt} = aN$$

a being the rate of growth per bacterium. This may be written

$$\frac{dN}{N} = a\,dt$$


1 Arbitrary constants are said to be independent if two or more of them cannot be
replaced by an equivalent single one. Thus the constants $c_1$ and $c_2$ in the functions
$ax + c_1 + c_2$ and $c_1 e^{x+c_2}$ are not independent because these functions may be written
$ax + c$ and $ce^x$, respectively.

This distinction is elementary. A more adequate analysis would focus attention 
upon independent solutions of the differential equation rather than independent con- 


to conform to this physical condition. 

Radioactive atoms decay at a rate proportional to the number of atoms,
N, present at any moment, t. Hence $dN/dt = -\lambda N$, which has the solu-
tion $N = N_0 e^{-\lambda t}$. The disintegration constant $\lambda$ measures the time rate
of decay per atom. It is a fundamental quantity characteristic of the
radioactive substance.
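The decay law can be checked numerically. The sketch below is not from the text: it compares the closed form $N = N_0 e^{-\lambda t}$ with a step-by-step integration of $dN/dt = -\lambda N$. The function names, the Euler scheme and the sample numbers are illustrative assumptions; a negative `lam` gives the growth law of the bacterial example.

```python
import math


def decay_exact(n0, lam, t):
    """Closed-form solution N = N0 * exp(-lam * t) of dN/dt = -lam * N."""
    return n0 * math.exp(-lam * t)


def decay_euler(n0, lam, t_end, steps=100_000):
    """Integrate dN/dt = -lam * N with the explicit Euler method."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n -= lam * n * dt  # loss during dt is proportional to N itself
    return n


if __name__ == "__main__":
    # Hypothetical sample: 1000 atoms, lam = 0.3 per unit time, t = 5.
    print(decay_exact(1000.0, 0.3, 5.0), decay_euler(1000.0, 0.3, 5.0))
```

The two values agree to within the discretization error of the Euler scheme, which shrinks as `steps` grows.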

b. Flow of water from an orifice . 

A vertical tank of uniform cross-section A is filled with water to an initial
height $h_0$. Water flows out through a hole of area a. It is desired to find
the height of the water, h, in the tank as a function of the time, t. The
volume flowing out in time dt is $av\,dt$, where v is the velocity of the water
at the orifice at time t. The loss of height in the tank is dh, hence the loss
of volume $A\,dh$. Therefore

$$av\,dt = -A\,dh$$

But the velocity is related to the height by Torricelli's formula: $v = c\sqrt{2gh}$.
The empirical constant c would be unity if there were no obstruction or
“vena contracta” near the orifice; for ordinary small holes with sharp
edges it is 0.6. Thus

$$ac\sqrt{2gh}\,dt = -A\,dh$$

or

$$\frac{dh}{\sqrt{h}} = -\frac{ac}{A}\sqrt{2g}\,dt$$

On integrating this we have

$$\sqrt{h} - \sqrt{h_0} = -\frac{ac}{2A}\sqrt{2g}\,t$$

where the constant of integration has been so adjusted that $h = h_0$ at
t = 0.
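As a small illustration (a sketch, not part of the original text), the integrated result can be solved for h and for the time at which the tank runs empty, $t = (A/ac)\sqrt{2h_0/g}$. The function and parameter names, the SI value of g and the sample dimensions are assumptions made for the example.

```python
import math


def water_height(t, h0, tank_area, hole_area, c=0.6, g=9.81):
    """Height from sqrt(h) = sqrt(h0) - (a*c/(2*A)) * sqrt(2*g) * t."""
    root = math.sqrt(h0) - (hole_area * c / (2.0 * tank_area)) * math.sqrt(2.0 * g) * t
    return max(root, 0.0) ** 2  # the tank stays empty once h reaches zero


def emptying_time(h0, tank_area, hole_area, c=0.6, g=9.81):
    """Time at which h = 0, obtained by setting sqrt(h) = 0 above."""
    return (tank_area / (hole_area * c)) * math.sqrt(2.0 * h0 / g)


if __name__ == "__main__":
    # Hypothetical tank: 1 m of water, 1 m^2 cross-section, 1 cm^2-scale hole.
    print(emptying_time(1.0, 1.0, 0.01), water_height(30.0, 1.0, 1.0, 0.01))
```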

c. Heat flow. 

When heat flows through a body the temperature, T, is in general a compli-
cated function of the coordinates within the body. In simple cases, how-
ever, it may depend only on a single coordinate, x (distance from a
plane, or distance from a point source of heat). In that case, the rate at
which heat crosses an area A perpendicular to x is given by

$$R = -kA\,\frac{dT}{dx} \tag{2-1}$$




and R is constant because of the continuity of flow. The quantity k is 
known as the thermal conductivity. 

(α) If the body is a slab with plane parallel faces, one of which is
maintained at a temperature $T_1$, integration of (1) leads to

$$T_1 - T = \frac{R}{kA}\,x$$

x being the distance from the heated face. From this one obtains the
elementary relation

$$R = kA\,\frac{T_1 - T_2}{d} \tag{2-2}$$

for the heat transfer across a plate of thickness d.

(β) If a heat source is placed at the center of a sphere, the temperature
is a function of r alone. Here $A = 4\pi r^2$, and (1) reads $-4\pi k r^2\,(dT/dr) = R$,
which gives

$$T = \frac{R}{4\pi k r} + C$$

In this case, the temperature is not a linear function of the distance from
the source as it was in (α).

(γ) At constant external temperature the thickness of ice on quiescent
water increases as the square root of the time. To show this we write (2)
in the form

$$R = \frac{dH}{dt} = kA\,\frac{\Delta T}{x}$$

where x now represents the thickness of ice and dH the quantity of heat
transported away from the lower surface of the ice in time dt. This, how-
ever, is proportional to the thickness dx which is added on to the already
existing layer in time dt. Hence dx/dt = C/x, C representing a constant.
From this it follows by integration that

$$x^2 \propto t$$


d. Salt dissolving in water . 

When $x_0$ grams of salt are placed in M grams of water at time t = 0, how
many grams will remain undissolved at time t? The rate of solution,
dx/dt, is proportional (a) to the number of grams, x, undissolved at time t,
(b) to the difference between the saturation concentration, X/M, and the
actual concentration, $(x_0 - x)/M$. (X is the number of grams of salt
that would produce saturation.) Thus

$$\frac{dx}{dt} = -kx\left(\frac{X}{M} - \frac{x_0 - x}{M}\right) \tag{2-3}$$

To solve, we write

$$\frac{dx}{(X - x_0)x + x^2} = \frac{1}{X - x_0}\left(\frac{dx}{x} - \frac{dx}{X - x_0 + x}\right) = -\frac{k}{M}\,dt$$

Integration then leads to:

$$\ln\frac{x}{X - x_0 + x} = -\frac{X - x_0}{M}\,kt + c$$

When the constant c is adjusted so that $x = x_0$ at t = 0, the result is

$$\ln\frac{(X - x_0 + x)\,x_0}{X\,x} = \frac{X - x_0}{M}\,kt$$

If $x_0 = X$, then the solution is

$$\frac{1}{x} - \frac{1}{x_0} = \frac{k}{M}\,t$$

as one may easily verify by going back to equation (3).
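The logarithmic result can be solved for x explicitly, $x = (X - x_0)x_0 / (X e^{(X - x_0)kt/M} - x_0)$, and compared with a direct numerical integration of the rate law. The following is a hedged sketch: the rate law is written in the form $dx/dt = -(k/M)\,x\,(X - x_0 + x)$ implied by the partial-fraction step above, and the function names, the Runge-Kutta integrator and the sample values are assumptions, not part of the text.

```python
import math


def undissolved(t, x0, X, M, k):
    """Grams of salt still undissolved at time t (closed form)."""
    a = X - x0
    if abs(a) < 1e-12:  # special case x0 = X: 1/x - 1/x0 = (k/M) t
        return 1.0 / (1.0 / x0 + k * t / M)
    return a * x0 / (X * math.exp(a * k * t / M) - x0)


def rate(x, x0, X, M, k):
    """Right-hand side dx/dt = -(k/M) * x * (X - x0 + x)."""
    return -(k / M) * x * (X - x0 + x)


def integrate_rk4(x0, X, M, k, t_end, steps=2000):
    """Classical fourth-order Runge-Kutta integration of the rate law."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = rate(x, x0, X, M, k)
        k2 = rate(x + 0.5 * h * k1, x0, X, M, k)
        k3 = rate(x + 0.5 * h * k2, x0, X, M, k)
        k4 = rate(x + h * k3, x0, X, M, k)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


if __name__ == "__main__":
    # Hypothetical sample: 20 g of salt, saturation at 36 g, 100 g of water.
    print(undissolved(10.0, 20.0, 36.0, 100.0, 0.5),
          integrate_rk4(20.0, 36.0, 100.0, 0.5, 10.0))
```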

e. Atmospheric pressure at any height. 

The increment of pressure between two points in the atmosphere differing
in height by dh is $dP = -\rho g\,dh$, if $\rho$ is the density at height h. But $\rho$ is
related to P by the expression $P\rho^{-\gamma} = P_0\rho_0^{-\gamma}$, which is valid for adiabatic
expansion of air if $\gamma$ is taken to be 1.4. 2 The quantities $P_0$ and $\rho_0$ are the
sea level values of P and $\rho$. Therefore

$$dP = -\left(\frac{P}{P_0}\right)^{1/\gamma}\rho_0\,g\,dh$$

and this, on integration, gives

$$\left(\frac{P}{P_0}\right)^{(\gamma-1)/\gamma} = 1 - \frac{\gamma - 1}{\gamma}\,\frac{\rho_0 g h}{P_0}$$

the constant of integration being adjusted so that $P = P_0$ at h = 0.
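A brief numerical sketch of the adiabatic pressure formula follows. It is an illustration only: the sea-level values (101325 Pa and 1.225 kg/m^3), the SI value of g and the function names are assumptions not given in the text.

```python
import math


def pressure(h, p0=101325.0, rho0=1.225, gamma=1.4, g=9.81):
    """Adiabatic atmosphere: (P/P0)**((gamma-1)/gamma) = 1 - ((gamma-1)/gamma)*rho0*g*h/P0."""
    base = 1.0 - ((gamma - 1.0) / gamma) * rho0 * g * h / p0
    return p0 * max(base, 0.0) ** (gamma / (gamma - 1.0))


def density(p, p0=101325.0, rho0=1.225, gamma=1.4):
    """Adiabatic relation P * rho**(-gamma) = P0 * rho0**(-gamma), solved for rho."""
    return rho0 * (p / p0) ** (1.0 / gamma)


if __name__ == "__main__":
    for height in (0.0, 1000.0, 5000.0):  # heights in metres
        print(height, pressure(height))
```

The pressure falls to zero at the finite height $\gamma P_0 / ((\gamma - 1)\rho_0 g)$, which is a property of the adiabatic model rather than of the real atmosphere.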


f. Homogeneous gas reactions. 

Chemical reactions involving but a single phase are said to be homogeneous.
Among these there may be distinguished unimolecular, bimolecular, ter-
molecular reactions and so on. In the unimolecular case, the number of
molecules undergoing a chemical change is at any instant proportional to
the number of molecules present. The decomposition of nitrogen pentox-
ide into oxygen and nitrogen tetroxide (2N₂O₅ → O₂ + 2N₂O₄) is an exam-
ple of this kind, the differential equation being similar to that describing
radioactive decay (Example a).

In a bimolecular reaction, of which there are numerous examples, sub-
stances A and B form molecules of type C. If a and b are the original
concentrations of A and B respectively, and x is the concentration of C at a
given instant, then

$$\frac{dx}{dt} = k(a - x)(b - x)$$


2 γ is the ratio of the specific heat at constant pressure to that at constant volume.




To integrate this equation, the expression $\dfrac{1}{(a - x)(b - x)}$ is resolved into
the partial fractions $\dfrac{1}{a - b}\left[\dfrac{1}{b - x} - \dfrac{1}{a - x}\right]$. We then have

$$\frac{1}{a - b}\int\left(\frac{dx}{b - x} - \frac{dx}{a - x}\right) = \int k\,dt$$

whence

$$\frac{1}{a - b}\ln\frac{a - x}{b - x} = kt + c$$

Since x = 0 at t = 0, $c = \dfrac{1}{a - b}\ln\dfrac{a}{b}$, so that

$$\ln\frac{b(a - x)}{a(b - x)} = (a - b)\,kt$$

From this, the reaction rate is seen to be

$$k = \frac{1}{t(a - b)}\ln\frac{b(a - x)}{a(b - x)}$$

The concentration of substance C is

$$x = \frac{a\left(1 - e^{(a-b)kt}\right)}{1 - \dfrac{a}{b}\,e^{(a-b)kt}}$$

When the original concentrations a and b are equal, the expression for k
becomes indeterminate, but on putting $b = a + \epsilon$ and letting $\epsilon$ approach
zero, an expansion of the logarithm yields

$$k = \frac{1}{at}\,\frac{x}{a - x}$$

which is also seen to be a solution of the differential equation

$$\frac{dx}{dt} = k(a - x)^2$$
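The bimolecular formulas lend themselves to a quick numerical check. The sketch below is an illustration under assumed names and sample values: it evaluates x(t), recovers k from a measured (x, t) pair, and treats a = b through the limiting form $x = a^2kt/(1 + akt)$, which follows from the last expression for k.

```python
import math


def concentration(t, a, b, k):
    """Concentration x of product C at time t for dx/dt = k (a - x)(b - x)."""
    if a == b:  # limiting case of equal initial concentrations
        return a * a * k * t / (1.0 + a * k * t)
    e = math.exp((a - b) * k * t)
    return a * (1.0 - e) / (1.0 - (a / b) * e)


def rate_constant(x, t, a, b):
    """Reaction rate k recovered from an observed concentration x at time t."""
    if a == b:
        return x / (a * t * (a - x))
    return math.log(b * (a - x) / (a * (b - x))) / (t * (a - b))


if __name__ == "__main__":
    # Hypothetical concentrations a = 2, b = 1 and rate constant k = 0.5.
    x = concentration(1.0, 2.0, 1.0, 0.5)
    print(x, rate_constant(x, 1.0, 2.0, 1.0))
```

Recovering k from the computed x returns the value that was put in, which is a useful consistency test when fitting measured concentrations.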

Other types of reactions will be dealt with in the problems on p. 40. As to
terminology, we note that a rate law for multimolecular reactions of the
form

$$\frac{dx}{dt} = k(a_1 - x)^{n_1}(a_2 - x)^{n_2}\cdots(a_i - x)^{n_i}$$

is often said to describe a reaction of the n-th order, where

$$n = n_1 + n_2 + \cdots + n_i$$

Any phase change of a substance which takes place at constant pressure and
temperature conforms to Clapeyron’s equation:

$$\frac{dP}{dT} = \frac{l}{T(V_f - V_i)}$$

Here l represents the latent heat of the process, $V_f$ and $V_i$ the volume per
mole of the final and the initial phase respectively, and P the pressure.
This equation may be applied to the process of sublimation, yielding an
approximate expression for the vapor pressure as a function of the tempera-
ture. In that case l, the latent heat of sublimation of the solid, is nearly
constant over a range of temperatures, and $V_i$, the volume of the solid,
may be neglected in comparison with that of the vapor, $V_f$. The vapor,
though not a perfect gas, will be taken to satisfy $V_f = RT/P$. Clapeyron’s
equation then becomes

$$\frac{dP}{dT} = \frac{lP}{RT^2}$$

which on integration gives

$$P = c\,e^{-l/RT}$$

an equation often called the Clausius-Clapeyron equation. This result is
found to be valid over small ranges of temperature, for the vapor pressure of
both solids and liquids. A more refined result may be obtained by intro-
ducing for l a more adequate approximation.

h. Centrifuge problem. 

When a cylinder of height h, filled with fluid, is rotating about its axis, the
pressure within the fluid will not be constant but will depend on r. Con-
sider a cylindrical shell of fluid of thickness dr, the surfaces of which are
coaxial with the rotating vessel. The net force pushing inward on this
shell is $2\pi rh\,dP$. This must equal the centripetal force due to the angular
speed $\omega$, namely $m\omega^2 r$, where m, the mass of the fluid, is given by $2\pi rh\,dr\cdot\rho$.
Hence

$$2\pi rh\,dP = 2\pi rh\rho\,dr\cdot\omega^2 r$$

(α) If the fluid is a liquid, the density, $\rho$, is constant and the solution is

$$P = \tfrac{1}{2}\rho\omega^2 r^2 + P_0$$

(β) If the fluid is a gas, $P = c\rho$ (since PV = const.), the solution is
then

$$P = P_0\,e^{\omega^2 r^2/2c}$$





i. Soap film. 

If a soap film is stretched between two circular wires, both having their
planes perpendicular to the line joining their centers, it will form a figure
of revolution about that line. At every point such as P (cf. Fig. 1) the
horizontal force acting around a vertical section of the film is the same.
Hence

$$2\pi y\,T\cos\theta = \text{const.}$$

Fig. 2-1

where T is the surface tension of the film (y being the radius of the film
at P and $\theta$ the inclination of its tangent to the axis). But

$$\cos\theta = \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{-1/2}$$

so that

$$y\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{-1/2} = c$$

T being a constant. Solving for the derivative,

$$\frac{dy}{dx} = \sqrt{\frac{y^2}{c^2} - 1}$$

which leads to

$$y = c\cosh\frac{x + c_1}{c}$$

The constants c and $c_1$ may be expressed in terms of the distance between
the wires and their radius. The longitudinal section of the film is seen to
be a catenary.

The examples above seem sufficient to illustrate the method under dis- 
cussion. The problems leading to separable first order equations are very 
numerous. 


Problems. 

a. Helmholtz' equation.

If a circuit has resistance R and inductance L, the current I in it obeys the differential
equation

$$L\frac{dI}{dt} + RI = E$$

where E is the impressed or external electromotive force. Show that the growth of a
current (E = const., I = 0 at t = 0) is described by

$$I = \frac{E}{R}\left(1 - e^{-(R/L)t}\right)$$

and the decay (E = 0, $I = I_0$ at t = 0) by

$$I = I_0\,e^{-(R/L)t}$$

b. Solve the equation for termolecular reactions:

$$\frac{dx}{dt} = k(a - x)(b - x)(c - x)$$


0 - D O - tT 0 - D - 

c. Solve the equation for opposing unimolecular and bimolecular reactions:

$$\frac{dx}{dt} = k_1(a - x) - k_2x^2$$

under the condition x = 0 at t = 0.

Ans. $\dfrac{a}{x} = \dfrac{k_2}{k_1}A\coth(Ak_2t) + \dfrac{1}{2}$, where $A^2 = \dfrac{k_1}{k_2}\left(a + \dfrac{k_1}{4k_2}\right)$

Show that, when equilibrium is established ($t \to \infty$),

$$\frac{x^2}{a - x} = \frac{k_1}{k_2}$$

d. Solve the equation for consecutive unimolecular reactions of the type

$$A \xrightarrow{\;k_1\;} B \xrightarrow{\;k_2\;} C$$

that is,

$$\frac{dn_1}{dt} = -k_1n_1, \qquad \frac{dn_2}{dt} = k_1n_1 - k_2n_2$$

Ans. $n_3 = (n_1 + n_2 + n_3)\left[1 - \dfrac{k_2}{k_2 - k_1}e^{-k_1t} + \dfrac{k_1}{k_2 - k_1}e^{-k_2t}\right]$

where $n_3$ = amount of C present at t.
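The stated answer can be tested against a direct numerical integration of the rate equations. The sketch below assumes that only substance A is present initially (so that $n_1 + n_2 + n_3$ equals the initial amount of A), that $k_1 \neq k_2$, and that C is formed at the rate $k_2n_2$; the function names and sample rate constants are illustrative assumptions.

```python
import math


def n3_closed(t, total, k1, k2):
    """Amount of C at time t when only A (amount `total`) is present at t = 0."""
    return total * (1.0 - k2 / (k2 - k1) * math.exp(-k1 * t)
                    + k1 / (k2 - k1) * math.exp(-k2 * t))


def derivs(n, k1, k2):
    """Rate equations for A -> B -> C; n = (n1, n2, n3)."""
    n1, n2, _ = n
    return (-k1 * n1, k1 * n1 - k2 * n2, k2 * n2)


def rk4(n, k1, k2, t_end, steps=1000):
    """Fourth-order Runge-Kutta integration of the three rate equations."""
    h = t_end / steps
    for _ in range(steps):
        a = derivs(n, k1, k2)
        b = derivs(tuple(x + 0.5 * h * d for x, d in zip(n, a)), k1, k2)
        c = derivs(tuple(x + 0.5 * h * d for x, d in zip(n, b)), k1, k2)
        d = derivs(tuple(x + h * e for x, e in zip(n, c)), k1, k2)
        n = tuple(x + h * (p + 2.0 * q + 2.0 * r + s) / 6.0
                  for x, p, q, r, s in zip(n, a, b, c, d))
    return n


if __name__ == "__main__":
    # Hypothetical sample: 100 units of A, k1 = 0.7, k2 = 0.2, t = 5.
    print(rk4((100.0, 0.0, 0.0), 0.7, 0.2, 5.0)[2], n3_closed(5.0, 100.0, 0.7, 0.2))
```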

e. A projectile is fired vertically into the air with initial velocity V. (1) Find its 
speed at any height; (2) find the time at which it will have traversed a distance r. 
Note: the differential equation to be solved is 

dv dv gR 2 




(2) If F 2 > 2 gR 

(F 2 - 2gR)~ l j^F 2 -2 gR + - vjr 

*«■ [, (v-VS+^ + fT*- , /]} 

(F 2 - 2?fl) 1 ' 2 “ F + (F 2 — 2gR) 112 + 1 111 R J j 

2.3. The Differential Equation is, or Can be Made, Exact. Linear 
Equations. — A differential equation, written in the form

$$A\,dx + B\,dy = 0 \tag{2-4}$$

where A and B are functions of x and y, is said to be exact if the left-hand
side is an exact differential. The necessary and sufficient condition for this
to be true was shown in sec. 1.7 to be equivalent to the Cauchy relations

$$\frac{\partial A}{\partial y} = \frac{\partial B}{\partial x}$$

The equations considered in the foregoing section, where A was a function
of x alone and B a function of y alone, are exact in the trivial sense that
$\partial A/\partial y = \partial B/\partial x = 0$.

Differential equations occurring in practice are rarely exact, but every
equation of the form (4) can be made exact and then integrated. The
device for doing this is to multiply it by a suitable factor known as the
integrating factor. For instance, the equation



is not exact. It becomes exact on multiplication by xy. For it then takes
the form



which has the solution:



const. 


While an integrating factor exists for every equation of the form (4), it
is not always easy to find. If the equation is linear, however, that is if it
can be written

$$\frac{dy}{dx} + f(x)\,y = g(x) \tag{2-5}$$

an integrating factor is $e^{F(x)}$. On multiplication by
this factor eq. (5) becomes

$$\frac{d}{dx}\left(y\,e^{F}\right) = g\,e^{F}$$

where the abbreviation $F(x) = \int^x f(\xi)\,d\xi$ has been used. The solution is,
clearly,

$$y = e^{-F}\left[\int e^{F}g\,dx + c\right] \tag{2-6}$$


This result is most useful, for the occurrence of linear equations is very 
frequent. 
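Formula (6) translates directly into a short numerical routine. The sketch below is an illustration, not the authors' method: both integrals are taken from the starting point $x_0$, so that $F(x_0) = 0$ and the constant c equals the initial value $y(x_0)$, and they are evaluated by the trapezoidal rule. The function name, the step count and the test equation $dy/dx + y = x$ are assumptions.

```python
import math


def solve_linear(f, g, x0, y0, x_end, steps=2000):
    """Evaluate y(x_end) for dy/dx + f(x) y = g(x), y(x0) = y0, from formula (6).

    F(x) is the integral of f from x0 to x, so the constant c equals y0.
    """
    h = (x_end - x0) / steps
    F = 0.0         # running value of F(x)
    integral = 0.0  # running value of the integral of exp(F) * g
    x = x0
    prev_f = f(x)
    prev_w = math.exp(F) * g(x)
    for _ in range(steps):
        x_next = x + h
        f_next = f(x_next)
        F += 0.5 * h * (prev_f + f_next)          # trapezoidal rule for F
        w_next = math.exp(F) * g(x_next)
        integral += 0.5 * h * (prev_w + w_next)   # trapezoidal rule for e^F g
        x, prev_f, prev_w = x_next, f_next, w_next
    return math.exp(-F) * (integral + y0)


if __name__ == "__main__":
    # dy/dx + y = x with y(0) = 1 has the exact solution y = x - 1 + 2 exp(-x).
    print(solve_linear(lambda x: 1.0, lambda x: x, 0.0, 1.0, 2.0),
          1.0 + 2.0 * math.exp(-2.0))
```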


Examples. 

a. Circuit containing inductance and resistance (Helmholtz' equation). 

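This problem has already been discussed, but it may be instructive to solve
the differential equation also by the method of eq. (6). We have

$$\frac{dI}{dt} + \frac{RI}{L} = \frac{E}{L} \tag{2-7}$$

Thus

$$f = \frac{R}{L} \quad\text{and}\quad F = \frac{R}{L}\,t; \qquad g = \frac{E}{L}$$

$$I = e^{-(R/L)t}\left[\frac{E}{L}\int e^{(R/L)t}\,dt + c\right] = \frac{E}{R} + c\,e^{-(R/L)t}$$

and this agrees with our previous result (Problem a).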


b. Circuit with inductance and resistance; variable electromotive force . 

The present method involves the solution of eq. (7) when E is a function of
the time, in which case the equation can no longer be separated. Let us
assume that

$$E = E_0\sin\omega t$$

Hence 3

$$I = e^{-(R/L)t}\,\frac{E_0}{L}\int e^{(R/L)t}\sin\omega t\,dt + c\,e^{-(R/L)t} = \frac{E_0}{L}\,\frac{1}{\omega'^2 + \omega^2}\,(\omega'\sin\omega t - \omega\cos\omega t) + c\,e^{-(R/L)t}$$

where $\omega'$ has been written for R/L, a quantity having the dimensions of a
frequency. To fix the constant we assume that I(0) = 0, in which case

$$I = \frac{E_0}{L}\,\frac{1}{\omega'^2 + \omega^2}\left(\omega'\sin\omega t - \omega\cos\omega t + \omega\,e^{-\omega' t}\right)$$

The last term represents transient currents which disappear as soon as
$t \gg 1/\omega'$.
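As a check on the solution with I(0) = 0, the sketch below substitutes it back into eq. (7), using a central difference for dI/dt. The function names and the circuit values are illustrative assumptions.

```python
import math


def current(t, e0, r, l, omega):
    """Current for E = E0 sin(omega t) with I(0) = 0, writing omega' = R/L."""
    wp = r / l
    pref = (e0 / l) / (wp * wp + omega * omega)
    return pref * (wp * math.sin(omega * t) - omega * math.cos(omega * t)
                   + omega * math.exp(-wp * t))


def residual(t, e0, r, l, omega, eps=1e-6):
    """dI/dt + (R/L) I - (E0/L) sin(omega t), which should vanish."""
    d_i = (current(t + eps, e0, r, l, omega) - current(t - eps, e0, r, l, omega)) / (2.0 * eps)
    return d_i + (r / l) * current(t, e0, r, l, omega) - (e0 / l) * math.sin(omega * t)


if __name__ == "__main__":
    # Hypothetical circuit: E0 = 10, R = 5, L = 0.1, omega = 100 (consistent units).
    for t in (0.0, 0.01, 0.05):
        print(t, current(t, 10.0, 5.0, 0.1, 100.0), residual(t, 10.0, 5.0, 0.1, 100.0))
```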


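3 Here and elsewhere, there occurs the integral $\int e^{\omega' t}\sin\omega t\,dt$. This is easily
evaluated if the sine is written as an exponential:

$$\sin x = \frac{1}{2i}\left(e^{ix} - e^{-ix}\right)$$

Thus

$$\int e^{\omega' t}\sin\omega t\,dt = \frac{1}{2i}\int\left[e^{(\omega' + i\omega)t} - e^{(\omega' - i\omega)t}\right]dt = \frac{1}{2i}\left[\frac{e^{(\omega' + i\omega)t}}{\omega' + i\omega} - \frac{e^{(\omega' - i\omega)t}}{\omega' - i\omega}\right]$$

$$= \frac{e^{\omega' t}}{2i\,(\omega'^2 + \omega^2)}\left[(\omega' - i\omega)e^{i\omega t} - (\omega' + i\omega)e^{-i\omega t}\right] = \frac{e^{\omega' t}}{\omega'^2 + \omega^2}\,(\omega'\sin\omega t - \omega\cos\omega t)$$

$$= -\frac{e^{\omega' t}}{(\omega'^2 + \omega^2)^{1/2}}\cos(\omega t + \theta), \qquad \theta = \tan^{-1}\frac{\omega'}{\omega}$$

Similarly:

$$\int e^{\omega' t}\cos\omega t\,dt = \frac{e^{\omega' t}}{\omega'^2 + \omega^2}\,(\omega'\cos\omega t + \omega\sin\omega t)$$

c. Radioactive decay of mother and daughter substances.

Let A be the number of atoms of the mother substance (e.g., UI) and B
the number of atoms of the daughter substance (e.g., UX₁) at time t, $A_0$
being the original value of A at t = 0. Let $\lambda_A$ and $\lambda_B$ be the decay con-
stants as defined in sec. 2.2a. The two substances satisfy the two differ-
ential equations

$$\frac{dA}{dt} = -\lambda_A A; \qquad \frac{dB}{dt} = -\lambda_B B + \lambda_A A$$

When the solution of the first, $A = A_0e^{-\lambda_A t}$, is substituted in the second,
there results

$$\frac{dB}{dt} + \lambda_B B = \lambda_A A_0\,e^{-\lambda_A t}$$

an equation which is linear in B and can be solved by formula (6). The
solution is:

$$B = e^{-\lambda_B t}\left[\int \lambda_A A_0\,e^{(\lambda_B - \lambda_A)t}\,dt + c\right] = \frac{\lambda_A A_0}{\lambda_B - \lambda_A}\left(e^{-\lambda_A t} - e^{-\lambda_B t}\right)$$

if we assume that B(0) = 0. Note that B will reach a maximum at

$$t = \frac{\ln\lambda_A - \ln\lambda_B}{\lambda_A - \lambda_B}$$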
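A short numerical sketch of the daughter-substance result follows; it evaluates B(t) and the time of the maximum, and can be used to confirm that the rate of change of B vanishes there. The function names and the sample decay constants are assumptions, and the formulas require $\lambda_A \neq \lambda_B$.

```python
import math


def daughter_atoms(t, a0, lam_a, lam_b):
    """Number of daughter atoms B(t), assuming B(0) = 0."""
    return lam_a * a0 / (lam_b - lam_a) * (math.exp(-lam_a * t) - math.exp(-lam_b * t))


def time_of_maximum(lam_a, lam_b):
    """Time at which B is largest: (ln lam_a - ln lam_b) / (lam_a - lam_b)."""
    return (math.log(lam_a) - math.log(lam_b)) / (lam_a - lam_b)


if __name__ == "__main__":
    # Hypothetical decay constants: slow mother (0.1), fast daughter (1.0).
    tm = time_of_maximum(0.1, 1.0)
    print(tm, daughter_atoms(tm, 1000.0, 0.1, 1.0))
```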

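Problem. A circuit contains capacitance C, resistance R, and is subject to an
electromotive force E. Calculate the instantaneous value of the electric charge on the
condenser, noting that it satisfies the differential equation

$$R\frac{dq}{dt} + \frac{q}{C} = E$$

Ans. For $E = E_0\sin\omega t$,

$$q = \frac{E_0}{R}\,\frac{1}{\omega'^2 + \omega^2}\left(\omega'\sin\omega t - \omega\cos\omega t + \omega\,e^{-\omega' t}\right), \qquad \omega' = \frac{1}{RC}$$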



2.4. Equations Reducible to Linear Form. — Of some 

interest is an equation of the form 

$$\frac{dy}{dx} + f(x)\,y = g(x)\,y^n \tag{2-8}$$

because it can be made linear by the substitution $u = y^{1-n}$, which con-
verts (8) into

$$\frac{du}{dx} + (1 - n)f(x)\,u = (1 - n)\,g(x)$$



2.5. Homogeneous Differential Equations. — A first order equation is 
said to be homogeneous 4 if, the equation being written in the form 

Adx + Bdy = 0 

A and B are homogeneous functions of the same degree, i.e., 

$$A(tx,ty) = t^a A(x,y); \qquad B(tx,ty) = t^a B(x,y)$$

If this is true we can substitute y = vx, obtaining

$$A(x,y) = A(x,vx) = x^a A(1,v); \qquad B(x,y) = x^a B(1,v)$$

The original equation,

$$\frac{dy}{dx} = -\frac{A}{B}$$

is converted into

$$v + x\frac{dv}{dx} = -\frac{A(1,v)}{B(1,v)} \equiv f(v)$$

by this substitution, and this equation is separable, yielding

$$\frac{dv}{f(v) - v} = \frac{dx}{x}$$
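The substitution y = vx can be sketched in code. The example below is illustrative only: it integrates the separated equation $dv/dx = (f(v) - v)/x$ by a Runge-Kutta scheme and recovers y = vx. The test equation $(x + y)\,dx - x\,dy = 0$, whose solution through (1, 2) is $y = x(\ln x + 2)$, and all names are assumptions; the routine needs x > 0 and $B(1,v) \neq 0$ along the path.

```python
import math


def solve_homogeneous(A, B, x0, y0, x_end, steps=2000):
    """Integrate A dx + B dy = 0 (A, B homogeneous of equal degree) via y = v x."""
    def f(v):
        # f(v) = -A(1, v) / B(1, v), as obtained from the substitution
        return -A(1.0, v) / B(1.0, v)

    def dv_dx(x, v):
        return (f(v) - v) / x

    h = (x_end - x0) / steps
    x, v = x0, y0 / x0
    for _ in range(steps):  # classical fourth-order Runge-Kutta in x
        k1 = dv_dx(x, v)
        k2 = dv_dx(x + 0.5 * h, v + 0.5 * h * k1)
        k3 = dv_dx(x + 0.5 * h, v + 0.5 * h * k2)
        k4 = dv_dx(x + h, v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return v * x


if __name__ == "__main__":
    y = solve_homogeneous(lambda x, y: x + y, lambda x, y: -x, 1.0, 2.0, 3.0)
    print(y, 3.0 * (math.log(3.0) + 2.0))
```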


Example. Lines of force. 

An equation closely related to the homogeneous type, and tractable by the 

4 A remark on the use of the word “homogeneous” in mathematics seems in order,
for the term is used with several different meanings in different contexts. The following
definitions correspond to the chief usages.

1. Homogeneous function: $f(x_1, x_2, \ldots, x_n)$ is said to be homogeneous in all its vari-
ables if, for any parameter t, $f(tx_1, tx_2, \ldots, tx_n) = t^a f(x_1, x_2, \ldots, x_n)$. a is the “degree” of
the homogeneous function.

2. Homogeneous equations: A set of simultaneous linear algebraic equations of the
form

$$\sum_{i=1}^{n} a_{ji}x_i = c_j, \qquad j = 1, 2, \ldots, n$$

in which the a’s are constants is said to be homogeneous if all c’s are zero.

3. Homogeneous differential equations: (Two usages of the term!)

a. A first order equation of the form A dx + B dy = 0 is said to be homogeneous if
A(x,y) and B(x,y) are homogeneous functions of the same degree.

b. In general, $F(x, y, y', y'', \ldots) = 0$ is said to be homogeneous if F is a homogeneous
function of y and all its derivatives, not necessarily of x. Thus

fn(x) +/n - i(j:) ^=i + • • • /i(*) • v - o 




substitution here described, is the differential equation for lines of force.
A line of force is defined as that curve which is tangent, at every point
through which it passes, to the force at that point. The present analysis
is applicable to attracting mass points, attracting or repelling electric
charges, and magnetic poles. Let it be desired, for example, to find the
lines of force due to two charges, $q_1$ and $q_2$, a distance 2a apart. (Cf.
Fig. 2.) If we restrict our consideration to the plane containing the charges
and the point P, then, for every point in this plane, the definition of a line
of force requires that

$$\frac{dy}{dx} = \frac{F_y}{F_x} = \frac{\dfrac{q_1}{r_1^3}(y + a) + \dfrac{q_2}{r_2^3}(y - a)}{\dfrac{q_1}{r_1^3}\,x + \dfrac{q_2}{r_2^3}\,x} \tag{2-9}$$

If a were zero, this would reduce to dy/dx = y/x, an equation which has
for its solution all straight lines through the origin. These, as is well
known, represent the lines of force due to a point charge. In general,
however, eq. (9) reads

$$\frac{q_1}{r_1^3}\left[x\,dy - (y + a)\,dx\right] + \frac{q_2}{r_2^3}\left[x\,dy - (y - a)\,dx\right] = 0 \tag{2-9a}$$

On introducing the new
variables $y_1 = y + a$ and $y_2 = y - a$, so that $dy_1 = dy_2 = dy$; $r_1 =
(x^2 + y_1^2)^{1/2}$, $r_2 = (x^2 + y_2^2)^{1/2}$, eq. (9a) takes the form

$$q_1\,\frac{x\,dy_1 - y_1\,dx}{(x^2 + y_1^2)^{3/2}} + q_2\,\frac{x\,dy_2 - y_2\,dx}{(x^2 + y_2^2)^{3/2}} = 0$$

each part of which is homogeneous. Now put $y_1 = v_1x$, $y_2 = v_2x$ so that

$$x^2\,dv_i = x\,dy_i - y_i\,dx \qquad (i = 1, 2)$$

The result is then simply

$$q_1\,\frac{dv_1}{(1 + v_1^2)^{3/2}} + q_2\,\frac{dv_2}{(1 + v_2^2)^{3/2}} = 0$$

When this is integrated, we immediately obtain the equation of the lines
of force due to the two charges:

$$q_1\,\frac{v_1}{(1 + v_1^2)^{1/2}} + q_2\,\frac{v_2}{(1 + v_2^2)^{1/2}} = \frac{q_1y_1}{r_1} + \frac{q_2y_2}{r_2} = \text{const.}$$
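The integrated equation says that $q_1y_1/r_1 + q_2y_2/r_2$ keeps the same value along any one line of force. The sketch below tests this numerically by stepping along the direction of the force and monitoring that quantity. It assumes the geometry used above, with $q_1$ at (0, -a) and $q_2$ at (0, a); the function names, the step size and the sample charges are illustrative assumptions.

```python
import math


def field(x, y, q1, q2, a):
    """Force components (up to a constant factor) for q1 at (0, -a), q2 at (0, a)."""
    r1 = math.hypot(x, y + a)
    r2 = math.hypot(x, y - a)
    fx = q1 * x / r1 ** 3 + q2 * x / r2 ** 3
    fy = q1 * (y + a) / r1 ** 3 + q2 * (y - a) / r2 ** 3
    return fx, fy


def line_constant(x, y, q1, q2, a):
    """The quantity q1*y1/r1 + q2*y2/r2, constant along a line of force."""
    r1 = math.hypot(x, y + a)
    r2 = math.hypot(x, y - a)
    return q1 * (y + a) / r1 + q2 * (y - a) / r2


def unit_tangent(x, y, q1, q2, a):
    fx, fy = field(x, y, q1, q2, a)
    norm = math.hypot(fx, fy)
    return fx / norm, fy / norm


def trace_line(x, y, q1, q2, a, ds=1e-3, steps=500):
    """Follow a line of force by Runge-Kutta steps of arc length ds."""
    points = [(x, y)]
    for _ in range(steps):
        k1 = unit_tangent(x, y, q1, q2, a)
        k2 = unit_tangent(x + 0.5 * ds * k1[0], y + 0.5 * ds * k1[1], q1, q2, a)
        k3 = unit_tangent(x + 0.5 * ds * k2[0], y + 0.5 * ds * k2[1], q1, q2, a)
        k4 = unit_tangent(x + ds * k3[0], y + ds * k3[1], q1, q2, a)
        x += ds * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += ds * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Hypothetical pair of opposite charges, started near q1.
    pts = trace_line(0.5, -1.0, 1.0, -1.0, 1.0)
    values = [line_constant(px, py, 1.0, -1.0, 1.0) for px, py in pts]
    print(min(values), max(values))
```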


2.6. Note on Singular Solutions. Clairaut’s Equation. — A first order 
equation of degree higher than the first may have a special kind of solution 
which is not obtainable by specifying the constants in its general solution. 
Thus consider 


$$y = x\frac{dy}{dx} + \left(\frac{dy}{dx}\right)^2 \tag{2-10}$$

This equation may be solved by the following artifice. Differentiate once
more, thus converting it into a second order equation, which, however, can
easily be handled by the methods already discussed. The result is

$$\frac{dy}{dx} = \frac{dy}{dx} + x\frac{d^2y}{dx^2} + 2\frac{dy}{dx}\frac{d^2y}{dx^2}$$

or

$$\left(x + 2\frac{dy}{dx}\right)\frac{d^2y}{dx^2} = 0 \tag{2-11}$$

If now the first factor be cancelled, the equation is

$$\frac{d^2y}{dx^2} = 0$$

and has the solution $y = c_1x + c_2$. This, however, is too general a result
since it contains two constants of integration, a circumstance brought
about by the additional differentiation. To satisfy eq. (10) it is
necessary to substitute this solution and adjust $c_2$ in conformity with its
demands. It is then seen that $c_2 = c_1^2$, and

$$y = cx + c^2$$

is the general solution of eq. (10).

But eq. (11) can also be satisfied by equating the first factor on the left 
to zero. This leads to 

$$x + 2\frac{dy}{dx} = 0, \quad\text{or}\quad y = -\frac{x^2}{4} + c$$

This will satisfy eq. (10) if c = 0. Thus

$$y = -\frac{x^2}{4}$$

is another solution of the original differential equation, but one which is not 
derivable from its complete solution. It is called a singular solution. 
Inspection will show that it represents the envelope of all the straight lines 
which correspond to the complete solution. This is generally the meaning 
of singular solutions. 

An equation of the form

$$y = x\frac{dy}{dx} + f\!\left(\frac{dy}{dx}\right)$$

is known to mathematicians as Clairaut’s equation. Eq. (10) is a specimen
of this type. Clairaut’s equation can always be handled by the method
here used and has the general solution

$$y = cx + f(c)$$
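For eq. (10) both kinds of solution can be verified by direct substitution, and the envelope property can be seen from the fact that the line with constant c touches the parabola $y = -x^2/4$ at x = -2c. The sketch below does this numerically; the function names are illustrative assumptions.

```python
def clairaut_residual(x, y, p):
    """Residual of eq. (10), y = x*p + p**2, where p stands for dy/dx."""
    return y - (x * p + p * p)


def general_solution(c, x):
    """Return (y, dy/dx) on the straight line y = c*x + c**2."""
    return c * x + c * c, c


def singular_solution(x):
    """Return (y, dy/dx) on the parabola y = -x**2 / 4."""
    return -x * x / 4.0, -x / 2.0


if __name__ == "__main__":
    for c in (-2.0, 0.5, 3.0):
        x_touch = -2.0 * c  # abscissa where the line meets its envelope
        print(c, general_solution(c, x_touch), singular_solution(x_touch))
```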

EQUATIONS OF HIGHER ORDER 

A general method for solving certain differential equations of higher 
order will be presented in secs. 2.10-12. It seems appropriate, however, to 
discuss first a few special types of differential equations which can be solved 
by elementary means. While the theory given in this section is applicable 
to equations of any order, emphasis will be placed solely on second order 
equations because of their prominence in mathematical physics. 

2.7. Linear Equations with Constant Coefficients; Right-Hand Mem- 
ber Zero. — In discussing this type of equation it becomes convenient to 
introduce a new notation; we write D = d/dx. A symbol such as D, 


algebra quite different in many respects from ordinary algebra. For the
present we merely observe that a differential equation of the type under
discussion in its most general form may be written:

$$D^ny + a_1D^{n-1}y + a_2D^{n-2}y + \cdots + a_ny = 0 \tag{2-12}$$

The a’s are constants; the order of the equation is n. Consider now the
differential equation

$$(D - r_1)(D - r_2)\cdots(D - r_n)\,y = 0 \tag{2-13}$$

which must be understood to mean that the successive application of
$d/dx - r_n$, $d/dx - r_{n-1}$, etc., upon y is to yield zero, the r’s being constants.
It is clear that (12) and (13) become identical when the r’s are chosen to be
the roots of the algebraic equation

$$r^n + a_1r^{n-1} + a_2r^{n-2} + \cdots + a_n = 0 \tag{2-14}$$

Let us then attempt to solve (13). A particular solution of that equation is
easily found, for if y satisfies

$$(D - r_n)\,y = 0$$

it will also satisfy (13), since further differentiations and multiplications by
r will leave the right-hand side unchanged. But $(D - r_n)y = 0$ has the
solution $y = c_ne^{r_nx}$, hence this is a particular solution of (13).

Furthermore, we observe that the order of the “factors” $(D - r_i)$
appearing in (13) is insignificant. Hence any factor may be written last,
and this means that $c_{n-1}e^{r_{n-1}x}$ is also a particular solution, and so on. On
adding all particular solutions, i.e., on putting

$$y = \sum_i c_ie^{r_ix} \tag{2-15}$$

there results a solution with n independent arbitrary constants, and this
must therefore be the complete solution. To summarize: in order to solve
(12), first determine the roots of (14), which is known as the auxiliary
equation. If these roots are denoted by $r_i$, the general solution is (15).
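For a second order equation the recipe just summarized takes only a few lines of code. The sketch below is an illustration under assumptions: it handles only distinct roots, and it fixes the two constants from initial values y(0) and y'(0), a choice made for the example rather than one stated in the text. Complex roots are allowed through the standard `cmath` module.

```python
import cmath


def auxiliary_roots(a1, a2):
    """Roots of the auxiliary equation r**2 + a1*r + a2 = 0."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0


def solve_second_order(a1, a2, y0, v0):
    """Return y(x) = c1*exp(r1*x) + c2*exp(r2*x) with y(0) = y0, y'(0) = v0."""
    r1, r2 = auxiliary_roots(a1, a2)
    if r1 == r2:
        raise ValueError("repeated root: the solution is (c1*x + c2)*exp(r1*x)")
    # c1 + c2 = y0 and r1*c1 + r2*c2 = v0 determine the two constants
    c1 = (v0 - r2 * y0) / (r1 - r2)
    c2 = y0 - c1

    def y(x):
        return (c1 * cmath.exp(r1 * x) + c2 * cmath.exp(r2 * x)).real

    return y


if __name__ == "__main__":
    # (D^2 - 3D + 2) y = 0 with y(0) = 1, y'(0) = 0 gives y = 2 e^x - e^(2x).
    y = solve_second_order(-3.0, 2.0, 1.0, 0.0)
    print(y(1.0))
```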

One point is to be noted. If the coefficients a appearing in (12) are
functions of x, the decomposition into factors leading to (13) cannot be
made by solving the auxiliary equation. The reason is that then the r's
will also be functions of x, and

$$(D - r_1)(D - r_2)\,y \neq (D - r_2)(D - r_1)\,y$$

as the reader may easily verify. This state of affairs is expressed succinctly
by saying that the operators $(D - r_1)$ and $(D - r_2)$ are commutative only

if the r’s are constants. For variable r’s the order of the factors
is also essential, so that the whole method of solution here described will
fail.

Returning to the case of constant coefficients, one minor point remains to
be considered. Suppose that two roots of the auxiliary equation are equal.
If they are called $r_1$ the supposedly general solution will contain the term
$(c_1 + c_2)e^{r_1x}$ which is equivalent to $ce^{r_1x}$. One arbitrary constant has been
lost and the solution obtained is no longer complete. To remedy this
fault we consider the two factors of (13) which gave rise to it, namely the
equation

$$(D - r_1)^2\,y = 0 \tag{2-16}$$

One solution is certainly $y = e^{r_1x}$. Let us look for a general one of the
form $y = f(x)e^{r_1x}$. On substitution of this into (16) there results the
following differential equation for f(x):

$$\frac{d^2f}{dx^2} = 0$$

Hence $f = c_1x + c_2$, and the complete solution of (16) reads

$$y = (c_1x + c_2)\,e^{r_1x}$$

This shows that, when two roots of the auxiliary equation both
have the value $r_1$, the part of the solution $(c_1 + c_2)e^{r_1x}$
must be replaced by $(c_1x + c_2)e^{r_1x}$. An extension of this argument leads
to the general result: If $r_i$ is a p-fold root of the auxiliary equation, the
complete solution of (12) is

$$y = c_1e^{r_1x} + c_2e^{r_2x} + \cdots + c_i\left(1 + a_1x + a_2x^2 + \cdots + a_{p-1}x^{p-1}\right)e^{r_ix} + \cdots$$

Examples. 

a. Simple harmonic motion . 

When the force on a particle of mass m moving along the y-axis is
$-ky$, Newton’s second law of motion reads:

$$m\frac{d^2y}{dt^2} = -ky$$

Here k, the force per unit of displacement of the particle, is called the
stiffness of the oscillator. If we denote the positive constant k/m by $\omega^2$,
the equation becomes $d^2y/dt^2 + \omega^2y = 0$. The roots of the auxiliary
equation are $\pm i\omega$, so that $y = c_1e^{i\omega t} + c_2e^{-i\omega t}$. On expanding the exponentials
in sines and cosines we obtain

$$y = (c_1 + c_2)\cos\omega t + (c_1 - c_2)\,i\sin\omega t = C_1\cos\omega t + C_2\sin\omega t$$

This last result may also be stated as follows:

$$y = A\sin(\omega t + \delta) = A'\cos(\omega t + \delta')$$

where the new constants A, $\delta$, and A', $\delta'$ are related to $C_1$ and $C_2$
by $A\sin\delta = C_1$, $A\cos\delta = C_2$; $A'\cos\delta' = C_1$, $-A'\sin\delta' = C_2$, or con-
versely $A^2 = A'^2 = C_1^2 + C_2^2$, $\delta = \tan^{-1}(C_1/C_2)$, $\delta' = \tan^{-1}(-C_2/C_1)$.

b. Chain sliding over a smooth peg.

A chain (cf. Fig. 3) is sliding over the peg, one end moving downward.
Let the displacement of this end from O, the point it would occupy in equi-
librium, be y. If the linear density of the chain is $\lambda$, and its total length l,
the mass to be accelerated is $l\lambda$. The resultant force is $2\lambda yg$. Hence, from
Newton’s second law,

$$l\lambda\,\frac{d^2y}{dt^2} = 2\lambda yg, \quad\text{or}\quad \frac{d^2y}{dt^2} - \frac{2g}{l}\,y = 0$$

The auxiliary equation has the roots $\pm\sqrt{2g/l}$, leading
to the general solution $y = c_1e^{\sqrt{2g/l}\,t} + c_2e^{-\sqrt{2g/l}\,t}$.
The constants may be fixed by supposing that, when t = 0, $y = y_0$ and
dy/dt = 0. Then $c_1 + c_2 = y_0$, $c_1 - c_2 = 0$; and

$$y = \frac{y_0}{2}\left(e^{\sqrt{2g/l}\,t} + e^{-\sqrt{2g/l}\,t}\right) = y_0\cosh\left(\sqrt{\frac{2g}{l}}\;t\right)$$

Fig. 2-3



Damped simple harmonic motion . 

1 the motion of the oscillator considered in example (a^4»td^]P^d> 
is present, besides the restoring force — ky, a damping force prGp&y^ 
1 (at small velocities) to — l(dy/dt ), the negative sign indicating that 
orce retards the motion; l is known as the damping constant. The 
■ential equation describing the motion is 




(2-17) 


3 written for the constant quantity l /2m. The auxiliary equation has 
oots 


-l ± VV - co 2 so that the general solution becomes 

y 


c ^- b+ vwrj )t + C2e (- b -v»=zt t 


2.7 


ORDINARY DIFFERENTIAL EQUATIONS 


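To adjust the constants in conformity with physical conditions we suppose
that, at $t = 0$, $y = y_0$ and $dy/dt = 0$. Then with the use of the abbreviation
$R = \sqrt{b^2 - \omega^2}$ the solution takes the form

$$y = \frac{y_0}{2}\,e^{-bt}\left[\left(1 + \frac{b}{R}\right)e^{Rt} + \left(1 - \frac{b}{R}\right)e^{-Rt}\right] \tag{2-18}$$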

Several special cases are of interest in this connection. 

($\alpha$) $b > \omega$. $R$ is then real, but smaller than $b$. Hence both terms of
(18) represent an exponential decrease. The motion is not oscillatory.

($\beta$) $b = \omega$. Then $R = 0$, and $y = y_0e^{-bt}(1 + bt)$. The motion is not
oscillatory; it is said to be critically damped.

($\gamma$) $b < \omega$. Then $R$ is imaginary and may be written $i\omega'$, where
$\omega'^2 = \omega^2 - b^2$. Eq. (18) now reads

$$y = y_0e^{-bt}\left(\cos\omega't + \frac{b}{\omega'}\sin\omega't\right)$$

or, in equivalent form,

$$y = \frac{\omega}{\omega'}\,y_0e^{-bt}\sin(\omega't + \delta)$$

where $\delta = \tan^{-1}(\omega'/b)$. This represents a damped sinusoidal motion of
period $T = 2\pi/\sqrt{\omega^2 - b^2}$; the amplitude decreases exponentially.
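The three damping regimes can be checked numerically. The sketch below is illustrative only: it assumes the solution fitted to $y(0) = y_0$, $dy/dt(0) = 0$ is $y = \tfrac{1}{2}y_0e^{-bt}[(1 + b/R)e^{Rt} + (1 - b/R)e^{-Rt}]$ with $R = \sqrt{b^2 - \omega^2}$, and the function names `damped_y` and `underdamped_y` are hypothetical helpers, not part of the text.

```python
import cmath
import math

def damped_y(t, y0, b, omega):
    """Solution fitted to y(0) = y0, y'(0) = 0, evaluated with complex
    arithmetic so one formula covers b > omega (R real) and b < omega
    (R imaginary). Assumed form, see the lead-in."""
    R = cmath.sqrt(b * b - omega * omega)
    if R == 0:  # critical damping: the limit R -> 0 of the general formula
        return y0 * math.exp(-b * t) * (1 + b * t)
    y = 0.5 * y0 * cmath.exp(-b * t) * ((1 + b / R) * cmath.exp(R * t)
                                        + (1 - b / R) * cmath.exp(-R * t))
    return y.real

def underdamped_y(t, y0, b, omega):
    """Equivalent sinusoidal form for b < omega:
    y = (omega/omega') y0 exp(-bt) sin(omega' t + delta)."""
    w = math.sqrt(omega * omega - b * b)   # omega'
    delta = math.atan2(w, b)               # delta = arctan(omega'/b)
    return (omega / w) * y0 * math.exp(-b * t) * math.sin(w * t + delta)

# Illustrative values: the two forms should agree for an underdamped case.
print(damped_y(1.3, 2.0, 0.4, 3.0), underdamped_y(1.3, 2.0, 0.4, 3.0))
```

Using complex arithmetic avoids writing separate branches for real and imaginary $R$; only the critically damped case needs its own limit formula.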


d. Natural oscillations in an electrical circuit.
In a circuit containing $R$, $L$, and $C$, the sum of the "partial" electromotive
forces due to inductance, resistance and capacitance equals the impressed
e.m.f. If the latter is zero (natural oscillations) we have

$$L\frac{dI}{dt} + RI + \frac{q}{C} = 0$$

or, remembering that $I = dq/dt$,

$$\frac{d^2q}{dt^2} + \frac{R}{L}\frac{dq}{dt} + \frac{1}{LC}\,q = 0$$

This equation is of the form (17); the constants are $b = R/2L$,
$\omega = (LC)^{-1/2}$. The solutions are already given in the foregoing example.
In particular, if oscillations are to take place, $\omega > b$, i.e., $2\sqrt{L/C} > R$.
The phase constant of case ($\gamma$) is then

$$\delta = \tan^{-1}\sqrt{\frac{4L}{R^2C} - 1}$$

The initial conditions here are that at $t = 0$, the condenser has a charge
$q_0$ and there is no current.

2.8. Linear Equations with Constant Coefficients; Right-Hand Member
a Function of x. We now restrict our considerations to differential
equations of the second order. In terms of the notation of the foregoing
section, the problem is to solve

$$(D^2 + a_1D + a_2)y = f(x) \tag{2-19}$$

If the roots of the auxiliary equation are $r_1$ and $r_2$, this equation takes the
form

$$(D - r_1)(D - r_2)y = f(x) \tag{2-20}$$

Put $(D - r_2)y \equiv u$, so that $(D - r_1)u = f(x)$. This is a linear first order
equation which can be solved by the method of sec. 3. It gives

$$u = e^{r_1x}\int e^{-r_1x}f(x)\,dx + c_1e^{r_1x} = e^{r_1x}(\varphi(x) + c_1)$$

where $\int^{x} e^{-r_1\xi}f(\xi)\,d\xi = \varphi(x)$. If this is substituted back into the
definition of $u$, the result is $(D - r_2)y = e^{r_1x}(\varphi(x) + c_1)$, an equation
which may again be treated in accordance with formula (6). Hence

$$y = e^{r_2x}\int e^{(r_1-r_2)x}[\varphi(x) + c_1]\,dx + c_2e^{r_2x}$$

$$= e^{r_2x}\int e^{(r_1-r_2)x}\varphi(x)\,dx + \frac{c_1}{r_1 - r_2}\,e^{r_1x} + c_2e^{r_2x}$$

On changing the meaning of the constant we write the solution of (19)

$$y = e^{r_2x}\int e^{(r_1-r_2)x}\varphi(x)\,dx + c_1e^{r_1x} + c_2e^{r_2x} \tag{2-21}$$

The form of this solution is interesting. The last two terms are identical
with the solution of the homogeneous equation. They are called the
complementary function, while the remainder, $e^{r_2x}\int e^{(r_1-r_2)x}\varphi(x)\,dx$, is
the particular integral. Thus the "inhomogeneity" of the




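by inspection, that is, by selecting any function which satisfies the equation.
When this is available one can form the complete solution by adding to this function
the general solution of the homogeneous equation. Usually, however, the
calculation of the particular integral is hardly more difficult.

The particular integral can be written in a form which is often more
convenient in practice. On performing a partial integration,

$$\int e^{(r_1-r_2)x}\varphi(x)\,dx = \frac{e^{(r_1-r_2)x}}{r_1-r_2}\,\varphi - \int\frac{e^{(r_1-r_2)x}}{r_1-r_2}\frac{d\varphi}{dx}\,dx = \frac{e^{(r_1-r_2)x}}{r_1-r_2}\int e^{-r_1x}f(x)\,dx - \int\frac{e^{-r_2x}}{r_1-r_2}f(x)\,dx$$

because $d\varphi/dx = e^{-r_1x}f(x)$. The particular integral then becomes

$$\frac{1}{r_1-r_2}\left[e^{r_1x}\int e^{-r_1x}f(x)\,dx - e^{r_2x}\int e^{-r_2x}f(x)\,dx\right]$$

and finally

$$y = \frac{1}{r_1-r_2}\left[e^{r_1x}\int e^{-r_1x}f(x)\,dx - e^{r_2x}\int e^{-r_2x}f(x)\,dx\right] + c_1e^{r_1x} + c_2e^{r_2x} \tag{2-22}$$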
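The two-integral form of the particular integral lends itself to a quick numerical check. The following sketch is not from the text: the test equation $y'' - 3y' + 2y = e^{3x}$ (roots $r_1 = 2$, $r_2 = 1$), the lower limit 0 for both integrals, and the helper names `simpson` and `particular_integral` are all assumptions chosen for illustration.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def particular_integral(f, r1, r2, x):
    """(1/(r1-r2)) [e^{r1 x} I1 - e^{r2 x} I2], with
    I_k = integral from 0 to x of e^{-r_k s} f(s) ds.
    The lower limit 0 is an arbitrary choice; changing it only adds
    multiples of e^{r1 x} and e^{r2 x}."""
    i1 = simpson(lambda s: math.exp(-r1 * s) * f(s), 0.0, x)
    i2 = simpson(lambda s: math.exp(-r2 * s) * f(s), 0.0, x)
    return (math.exp(r1 * x) * i1 - math.exp(r2 * x) * i2) / (r1 - r2)

# Illustrative equation: y'' - 3y' + 2y = exp(3x), so r1 = 2, r2 = 1.
f = lambda s: math.exp(3 * s)
x = 0.7
numeric = particular_integral(f, 2.0, 1.0, x)
# Closed form of the same expression, worked out by hand for this f:
closed = 0.5 * math.exp(3 * x) - math.exp(2 * x) + 0.5 * math.exp(x)
print(numeric, closed)
```

The closed form contains $\tfrac{1}{2}e^{3x}$, which satisfies the inhomogeneous equation, plus terms in $e^{2x}$ and $e^{x}$ that belong to the complementary function.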


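Examples.

a. Forced oscillations of a mechanical or electrical system.
The equation to be considered is (17) but with a function of $t$ instead of
zero on the right. In most applications this function, which represents the
impressed force divided by the mass of the oscillating system in the mechanical
case, is a sinusoidal function of the time. Hence we consider
the differential equation

$$\frac{d^2y}{dt^2} + 2b\frac{dy}{dt} + \omega^2y = f_0\sin\alpha t \tag{2-23}$$

As in sec. 7, example (c), the auxiliary equation has the roots

$$r_1 = -b + \sqrt{b^2 - \omega^2}, \qquad r_2 = -b - \sqrt{b^2 - \omega^2}$$

If again we denote $\sqrt{b^2 - \omega^2}$ by $R$, the particular integral is found by carrying out the integrations in (22).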



When this is done and the terms are suitably collected,

$$\text{P.I.} = \frac{f_0}{2R}\left\{\left[\frac{b-R}{(b-R)^2+\alpha^2} - \frac{b+R}{(b+R)^2+\alpha^2}\right]\sin\alpha t - \left[\frac{\alpha}{(b-R)^2+\alpha^2} - \frac{\alpha}{(b+R)^2+\alpha^2}\right]\cos\alpha t\right\}$$

$$= \frac{f_0}{(\omega^2-\alpha^2)^2 + 4\alpha^2b^2}\left\{(\omega^2-\alpha^2)\sin\alpha t - 2b\alpha\cos\alpha t\right\}$$

To obtain the complete solution we must add to this the solution of (17).
Hence

$$y = \frac{f_0}{(\omega^2-\alpha^2)^2 + 4\alpha^2b^2}\left\{(\omega^2-\alpha^2)\sin\alpha t - 2b\alpha\cos\alpha t\right\} + e^{-bt}(c_1e^{Rt} + c_2e^{-Rt}) \tag{2-24}$$

It is seen that the complementary function decays exponentially with $t$
and will be damped out eventually. It is therefore of little interest in
physical applications. The amplitude of the oscillations,

$$\frac{f_0}{\sqrt{(\omega^2-\alpha^2)^2 + 4\alpha^2b^2}}$$

has a maximum when the impressed (angular) frequency has the value

$$\alpha = (\omega^2 - 2b^2)^{1/2}$$

This is said to be the condition of resonance between the impressed force
and the vibrating system. If $b$ is zero there occurs what is sometimes
referred to as the "resonance catastrophe," for in that case the amplitude
is infinite when $\alpha = \omega$.
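The resonance condition can be confirmed by brute force. This is a minimal sketch, assuming the steady-state amplitude is $f_0/\sqrt{(\omega^2-\alpha^2)^2 + 4\alpha^2b^2}$; the numerical values of $\omega$ and $b$ and the grid spacing are arbitrary illustrative choices.

```python
import math

def amplitude(alpha, omega, b, f0=1.0):
    """Steady-state amplitude f0 / sqrt((omega^2 - alpha^2)^2 + 4 alpha^2 b^2)."""
    return f0 / math.sqrt((omega**2 - alpha**2) ** 2 + 4 * alpha**2 * b**2)

omega, b = 2.0, 0.5   # illustrative values with omega^2 > 2 b^2
grid = [i * 1e-4 for i in range(1, 40001)]   # alpha from 0.0001 to 4.0
alpha_peak = max(grid, key=lambda a: amplitude(a, omega, b))
alpha_theory = math.sqrt(omega**2 - 2 * b**2)
print(alpha_peak, alpha_theory)
```

The grid maximum should agree with $(\omega^2 - 2b^2)^{1/2}$ to within the grid spacing. When $\omega^2 < 2b^2$ the formula has no real root and the amplitude simply decreases with $\alpha$.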

($\alpha$) Mechanical system.

The present theory can be applied, for instance, to a mass $m$ held in equilibrium
by a spring of stiffness $k$ and damping constant $l$. We then have, as
in sec. 7c,

$$b = \frac{l}{2m}, \qquad \omega = \sqrt{\frac{k}{m}}$$

Resonance occurs when

$$\alpha = \left(\frac{k}{m} - \frac{l^2}{2m^2}\right)^{1/2}$$

($\beta$) Electrical system.




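occurs when

$$\alpha = \left(\frac{1}{LC} - \frac{R^2}{2L^2}\right)^{1/2}$$

The solution (24) represents the charge, $q$, residing on the condenser at any
instant. The current $I$ is obtained by differentiating with respect to the
time. Both terms in braces then become positive,

$$I = A[(\omega^2-\alpha^2)\cos\alpha t + 2b\alpha\sin\alpha t]$$

where $A$ stands for $E_0\alpha/L[(\omega^2-\alpha^2)^2 + 4\alpha^2b^2]$. The energy dissipated
in the circuit during a time $T$ is $\int_0^T EI\,dt$. This integral contains two terms, one with the
integrand $\sin\alpha t\cos\alpha t$, the other with the integrand $\sin^2\alpha t$. The first of
these is 0 provided $T$ is taken large enough to include many
cycles $2\pi/\alpha$, the last gives $\int_0^T\sin^2\alpha t\,dt = T/2$. Hence the energy dissipated
is

$$E_0Ab\alpha T$$

The part of the current proportional to $\cos\alpha t$ causes no dissipation;
it is a "wattless" current which is always out of phase with the
electromotive force.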


b. Electrical polarization.

An equation like (23) also describes the response of an electron to an
impinging electromagnetic wave. A light wave, plane
polarized in such a way that its electric vector is along $y$, falling
upon an electron inside a refracting medium, will exert a force
$eE_0\sin\alpha t$ upon this electron. Here $E_0$ is the amplitude of the electric
vector of the light wave, $e$ the charge on an electron, $\alpha$ the frequency of the
light (assumed monochromatic). $f_0$ in (23) is then $eE_0$ divided by the
electron mass. The solution is given by (24). $y$ represents the displacement
of the electron under consideration at the time $t$; the displaced electron forms a
dipole of moment $ey$. By "polarization" is meant the dipole moment per
unit volume of the material, and this is obtained on multiplying the
moment due to one electron by the number of dispersion electrons per
unit volume. If this number is $N$, then the polarization is $P = Ney$.
The index of refraction and the conductivity of the substance may be deduced very easily
from this expression for $P$.

2.9. Other Special Forms of Second Order Differential Equations.
a. An equation of the type

$$\frac{d^2y}{dx^2} = f(x) \tag{2-25}$$

may be integrated by the method of sec. 8. If this is done, only formula
(21) is applicable, for the second formula (22) involves the quantity
$r_1 - r_2$ which is zero, the auxiliary equation corresponding to (25) having
equal roots: $r_1 = r_2 = 0$. The solution is

$$y = \int\varphi(x)\,dx + c_1 + c_2x = \int\left[\int f(x)\,dx\right]dx + c_1 + c_2x$$

The procedure is here very artificial, of course, for this result could have
been obtained directly by integrating (25) twice.

Example. Suspension bridge.

Consider the part of the cable between $A$ and the variable point $P$. It is in
equilibrium under the action of three forces: the horizontal force, $H$, the
tension, $T$, at $P$, and the weight $W$ of, or supported by, $AP$, which of course
need not act at the middle of the segment. Hence we have

$$T\sin\theta = W; \qquad T\cos\theta = H; \qquad \tan\theta = \frac{dy}{dx} = \frac{W}{H}$$

This relation is true for every point $P$, provided $W$ is the load between $A$
and $P$. It is generally more convenient to write the equation in terms of
$w = dW/dx$, i.e., the load per unit horizontal distance; $w = w(x)$:

$$\frac{d^2y}{dx^2} = \frac{w(x)}{H} \tag{2-26}$$




In the case of the suspension bridge, the load is uniform,
$w = \text{const.}$

Solution:

$$y = \frac{w}{2H}\,x^2 + c_1x + c_2, \quad\text{a parabola}$$


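b. Equations not containing y.
If the equation to be solved is

$$\frac{d^2y}{dx^2} = f\left(\frac{dy}{dx},\, x\right)$$

introduce the new variable $p = dy/dx$. The resulting first order equation

$$\frac{dp}{dx} = f(p, x)$$

can then be solved by one of the methods already discussed.

Example. Cable hanging under its own weight.

The equation describing the cable is (26), but $w$ is not constant. In this
case it is $dW/ds$, the weight per unit length of cable, which is constant,
provided the latter is uniform. Put $dW/ds = \lambda$. Then

$$\frac{d^2y}{dx^2} = \frac{\lambda}{H}\frac{ds}{dx} = \frac{\lambda}{H}\sqrt{1 + \left(\frac{dy}{dx}\right)^2}$$

From this $dp/\sqrt{1 + p^2} = (\lambda/H)\,dx$, so that

$$\sinh^{-1}p = \frac{\lambda}{H}\,x + c_1$$

If the origin is chosen at the lowest point of the cable, $c_1 = 0$, and

$$\frac{dy}{dx} = \sinh\frac{\lambda}{H}x, \qquad y = \frac{H}{\lambda}\cosh\frac{\lambda}{H}x + c = \frac{H}{\lambda}\left(\cosh\frac{\lambda}{H}x - 1\right)$$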

This curve is known as a catenary. 
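As a sanity check on the catenary, one can verify by finite differences that $y = (H/\lambda)(\cosh(\lambda x/H) - 1)$ satisfies $y'' = (\lambda/H)\sqrt{1 + y'^2}$. The sketch below is only illustrative; the values of `lam` and `H`, the step `h`, and the helper names are assumptions.

```python
import math

def catenary(x, lam, H):
    """y = (H/lam) (cosh(lam x / H) - 1), origin at the lowest point."""
    return (H / lam) * (math.cosh(lam * x / H) - 1.0)

def residual(x, lam, H, h=1e-4):
    """y'' - (lam/H) sqrt(1 + y'^2), derivatives by central differences."""
    y_minus, y_0, y_plus = (catenary(x + s * h, lam, H) for s in (-1, 0, 1))
    d1 = (y_plus - y_minus) / (2 * h)
    d2 = (y_plus - 2 * y_0 + y_minus) / (h * h)
    return d2 - (lam / H) * math.sqrt(1.0 + d1 * d1)

lam, H = 2.0, 5.0   # illustrative weight per unit length and horizontal tension
print([residual(x, lam, H) for x in (-3.0, 0.0, 1.5)])
```

The residuals should be zero up to discretisation and rounding error, which for this step size is far below the size of the terms themselves.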

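c. Equations not containing x.

$$\frac{d^2y}{dx^2} = f\left(\frac{dy}{dx},\, y\right)$$

Introduce again $p = dy/dx$, but regard $y$ as the independent variable, so that
$d^2y/dx^2 = p\,dp/dy$. The resulting equation

$$p\frac{dp}{dy} = f(p, y)$$

is solved for $p$, then integrated once more.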

All linear homogeneous equations of the second order with constant 
coefficients discussed in sec. 2.7 can be solved by this method, but the 
treatment of sec. 2.7 is usually simpler. 


Example. Anharmonic oscillator.

Differential equation:

$$\frac{d^2y}{dt^2} + \omega^2y + \lambda y^2 = 0$$

Solution:

$$p\,dp = -(\omega^2y + \lambda y^2)\,dy, \qquad p = \frac{dy}{dt} = \left(c_1 - \omega^2y^2 - \tfrac{2}{3}\lambda y^3\right)^{1/2}$$

The integration of this equation leads to an elliptic function. 6

Problem. Solve the equation for the anharmonic oscillator by successive approximation,
assuming that $\lambda y \ll \omega^2$.

Ans.

$$y = a\cos(\omega t + \epsilon) - \frac{\lambda a^2}{2\omega^2}\left[1 - \tfrac{1}{3}\cos 2(\omega t + \epsilon)\right]$$


INTEGRATION IN SERIES 

A type of differential equation occurring very commonly in physics has 
the form 

$$y'' + X_1y' + X_2y = 0 \tag{2-27}$$

where $X_1$ and $X_2$ are functions of $x$, the independent variable. Here and
in the following, primes denote differentiations with respect to $x$. The
methods developed in the preceding sections of this chapter are suitable for
solving (27) when $X_1$ and $X_2$ have special forms, but are far from yielding
solutions of that equation in general. In fact, such solutions are frequently
not available in closed or finite form. For certain regions of $x$, however,
they may be found in the form of convergent series by a procedure to be
studied presently.

6 See Peirce, B. O., "Short Table of Integrals," Third Revised Edition, Ginn and
Co., New York, 1929. Introductory treatments of elliptic integrals may be found in
"Higher Mathematics" by R. S. Burington and C. C. Torrance, McGraw-Hill Book



A few remarks concerning the behavior of the solutions in limited regions of $x$
may be of value. In studying their behavior, it is often advisable to eliminate
the first derivative from (27), which is always possible by means of a
transformation of the dependent variable. Instead of $y$, we introduce $v$,
related to $y$ by

$$y = v\,e^{-\frac{1}{2}\int X_1\,dx}$$

When this is substituted into (27) and the exponential factor
cancelled, there results an equation for $v$ of the form

$$v'' = f(x)\,v$$

Provided $v$ is finite, it has a point of inflexion wherever $f(x)$ vanishes. In
regions where $f(x) > 0$ two facts are to be noted: If $v$ is positive and has a
positive slope, the slope will continually increase as $x$ increases, causing $v$
to grow rapidly. If $v$ is positive and has a negative slope, the slope will
continually become less negative, causing $v$ to approach the $x$-axis and then in
general to turn upwards again. For negative $v$, the words "positive" and
"negative" in the preceding sentences should be interchanged. This qualitative
behavior is most easily remembered if we consider the special case
$f(x) = \text{const.} = \omega^2 > 0$. The solution is then

$$v = c_1e^{\omega x} + c_2e^{-\omega x}$$

which typifies the foregoing remarks.






If, however, we consider a region in which $f(x) < 0$, the slope of positive
$v$ will be continually diminished. Thus if $v$ starts out with positive slope
this will soon be zero and then decrease until $v = 0$; as $v$ then becomes
negative its negative slope will increase until it is horizontal and $v$ turns
back toward zero. In short, $v$ is oscillatory. This again is easily remembered
if we consider the special case in which $f(x) = -\omega^2 < 0$ for it has the
solution $v = c\sin(\omega x + \delta)$.

Fig. 5 illustrates these facts. To the left of $A$, $v$ oscillates; at $A$ it has a
point of inflexion; to the right of $A$ it is of exponential behavior.

2.11. Example of Integration in Series. Legendre's Equation.
To illustrate the method of series integration, let us postpone fundamental
matters and start by studying a specific example. An equation of considerable
interest is Legendre's; it has the form

$$(1 - x^2)y'' - 2xy' + l(l+1)y = 0 \tag{2-28}$$

in which $l$ is a constant. We attempt to find a solution which is a series in
positive powers of $x$. If the lowest power occurring is $\kappa$, this solution will
have the general form

$$y = \sum_{\lambda=0}^{\infty} a_\lambda x^{\kappa+\lambda} \tag{2-29}$$

Solving the differential equation then amounts to determining the coefficients
$a_\lambda$. Whether the series converges can be tested after this has been
achieved. At present it will be assumed that this is the case, and that (29)
may be differentiated term by term. When (29) is substituted in (28)
the result is

$$\sum_\lambda a_\lambda(\kappa+\lambda)(\kappa+\lambda-1)x^{\kappa+\lambda-2} - \sum_\lambda a_\lambda[(\kappa+\lambda)(\kappa+\lambda-1) + 2(\kappa+\lambda) - l(l+1)]x^{\kappa+\lambda} = 0 \tag{2-30}$$

This equation must hold for every value of $x$, and this can be true only if
the coefficient of every power of $x$ is identically zero. Since $\lambda$ cannot, by
hypothesis, be negative, the lowest power of $x$ occurring in (30) is $x^{\kappa-2}$,
and it is present only in the first summation of (30). Thus we find, putting
$\lambda = 0$ to obtain the term in question,

$$a_0\kappa(\kappa - 1) = 0 \tag{2-31}$$

$a_0$ is the lowest coefficient in our summation and hence not zero. Equation
(31) therefore determines $\kappa$. It is often called the indicial equation.
Clearly, two values of $\kappa$ are permissible: $\kappa = 0$ and $\kappa = 1$.



As in the foregoing, the coefficient of $x^{\kappa+j}$ must vanish for every $j$.
Now the term corresponding to the $(\kappa + j)$-th power of $x$ is obtained in the
first summation by putting $\lambda = j + 2$, in the second by putting $\lambda = j$.
Hence

$$a_{j+2}(\kappa + j + 2)(\kappa + j + 1) = a_j[(\kappa + j)(\kappa + j + 1) - l(l+1)]$$

or

$$a_{j+2} = \frac{(\kappa + j)(\kappa + j + 1) - l(l+1)}{(\kappa + j + 1)(\kappa + j + 2)}\,a_j \tag{2-32}$$

Thus, if $a_j$ is given, $a_{j+2}$ can be computed from this relation. Starting
with $a_0$, (32) permits us to obtain, successively, $a_2$, $a_4$, etc. $a_0$ is
arbitrary; it is one of the two arbitrary constants appearing in the general
solution of a second order differential equation. On the other hand, if $a_1$
is assigned arbitrarily, all coefficients with odd subscripts are determined by
(32).

Choice 1. Let us take $\kappa = 0$. Eq. (32) then reads

$$a_{j+2} = \frac{j(j+1) - l(l+1)}{(j+1)(j+2)}\,a_j \tag{2-33}$$

On taking $a_0$ and $a_1$ as arbitrary constants, the solution becomes

$$y = \left(1 - \frac{l(l+1)}{2}x^2 - \frac{l(l+1)}{2}\cdot\frac{6 - l(l+1)}{12}x^4 - \cdots\right)a_0 + \left(x + \frac{2 - l(l+1)}{6}x^3 + \frac{2 - l(l+1)}{6}\cdot\frac{12 - l(l+1)}{20}x^5 + \cdots\right)a_1$$

$$= \left(1 - \frac{l(l+1)}{2!}x^2 + \frac{l(l-2)(l+1)(l+3)}{4!}x^4 - \cdots + (-1)^r\frac{l(l-2)\cdots(l-2r+2)(l+1)\cdots(l+2r-1)}{(2r)!}x^{2r} + \cdots\right)a_0$$

$$+ \left(x - \frac{(l-1)(l+2)}{3!}x^3 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^5 - \cdots + (-1)^r\frac{(l-1)(l-3)\cdots(l-2r+1)(l+2)\cdots(l+2r)}{(2r+1)!}x^{2r+1} + \cdots\right)a_1 \tag{2-34'}$$




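Choice 2. Let us take $\kappa = 1$. Eq. (32) then reads

$$a_{j+2} = \frac{(j+1)(j+2) - l(l+1)}{(j+2)(j+3)}\,a_j$$

If now we take again $a_0$ and $a_1$ as arbitrary constants, we find

$$y = \left(x + \frac{2 - l(l+1)}{6}x^3 + \frac{2 - l(l+1)}{6}\cdot\frac{12 - l(l+1)}{20}x^5 + \cdots\right)a_0 + \left(x^2 + \frac{6 - l(l+1)}{12}x^4 + \frac{6 - l(l+1)}{12}\cdot\frac{20 - l(l+1)}{30}x^6 + \cdots\right)a_1$$

$$= \left(x - \frac{(l-1)(l+2)}{3!}x^3 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^5 - \cdots\right)a_0 + \left(x^2 - \frac{(l-2)(l+3)}{12}x^4 + \frac{(l-2)(l-4)(l+3)(l+5)}{360}x^6 - \cdots\right)a_1 \tag{2-35'}$$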


The terms multiplying $a_0$ in (35') are seen to be identical with those multiplying
$a_1$ in (34'); hence these two particular solutions are the same. The
second part of (35'), however, does not agree with the first of (34'), both
of which represent series in even powers of $x$. It might seem, therefore, as
if we had obtained altogether three independent solutions, which is, of
course, impossible. But closer inspection would show that the second part
of (35') is not a solution at all. This is seen at once if, after assuming any
specific value for $l$, we substitute it back into the differential equation.
The trouble is that, putting $\kappa = 1$ and $a_0 = 0$, we have carelessly discarded
the constant term which might appear in the sequence. The present
example indicates clearly that the solution of a differential equation is not
an altogether mechanical matter and that caution must be used at every
step. Summarizing, we observe that the significant parts of (34') and
(35') are:

$$y = a_0\left[1 - \frac{l(l+1)}{2!}x^2 + \frac{l(l-2)(l+1)(l+3)}{4!}x^4 - \cdots + (-1)^r\frac{l(l-2)\cdots(l-2r+2)(l+1)\cdots(l+2r-1)}{(2r)!}x^{2r} + \cdots\right] \tag{2-34}$$

$$y = a_1\left[x - \frac{(l-1)(l+2)}{3!}x^3 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^5 - \cdots + (-1)^r\frac{(l-1)(l-3)\cdots(l-2r+1)(l+2)\cdots(l+2r)}{(2r+1)!}x^{2r+1} + \cdots\right] \tag{2-35}$$




One further point should be observed. When any one term in the series is
zero, all succeeding terms vanish also and the series becomes a polynomial.
The conditions under which infinite series like (34) reduce to polynomials
are of great importance in many physical problems and will be discussed
more fully later.
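The termination of the series is easy to see by running the recurrence for $\kappa = 0$, $a_{j+2} = \frac{j(j+1) - l(l+1)}{(j+1)(j+2)}a_j$, with exact rational arithmetic. The sketch below is illustrative; the function name `legendre_series` and the choice of 12 terms are assumptions, and $l$ is taken to be an integer so that exact fractions can be used.

```python
from fractions import Fraction

def legendre_series(l, a0=0, a1=0, n_terms=12):
    """Coefficients a_0 .. a_{n_terms-1} of the ascending series (kappa = 0),
    generated from a_{j+2} = [j(j+1) - l(l+1)] / [(j+1)(j+2)] * a_j.
    l must be an integer here so that Fraction stays exact."""
    a = [Fraction(0)] * n_terms
    a[0], a[1] = Fraction(a0), Fraction(a1)
    for j in range(n_terms - 2):
        a[j + 2] = Fraction(j * (j + 1) - l * (l + 1), (j + 1) * (j + 2)) * a[j]
    return a

even_l2 = legendre_series(2, a0=1)   # even series for l = 2
odd_l3 = legendre_series(3, a1=1)    # odd series for l = 3
print(even_l2[:6])
print(odd_l3[:6])
```

For $l = 2$ the even series stops after the $x^2$ term, giving $1 - 3x^2$; for $l = 3$ the odd series stops after $x^3$, giving $x - \tfrac{5}{3}x^3$. Both are constant multiples of the familiar Legendre polynomials.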

The work thus far has only established the fact that the series (34) and
(35) are formal solutions of Legendre's equation, that is, they will
satisfy (28) if substituted in it. Whether the solutions are of any value
depends on their convergence properties. A series converges if the ratio of
the absolute values of two successive terms,

$$\left|\frac{u_{j+1}}{u_j}\right|$$

is smaller than unity for large $j$. Now this ratio is clearly

$$\left|\frac{a_{j+2}}{a_j}\right|x^2$$

But

$$\left|\frac{a_{j+2}}{a_j}\right|$$

is immediately obtainable from (33). As $j \to \infty$ it becomes 1. Hence
the condition that (34) and (35) converge is that $x^2 < 1$, and this is true as
long as $|x| < 1$. For values of $x$ in the range $-1 < x < 1$ our solution is
a significant one; for other values it fails. Is it possible to construct a
solution valid for $|x^2| > 1$? This is indeed not difficult.

Let us suppose that $y$, instead of being given by (29), has the form
$y = \sum_\lambda a_\lambda x^{\kappa-\lambda}$. Eq. (30) will then read

$$\sum_\lambda a_\lambda(\kappa-\lambda)(\kappa-\lambda-1)x^{\kappa-\lambda-2} - \sum_\lambda a_\lambda[(\kappa-\lambda)(\kappa-\lambda+1) - l(l+1)]x^{\kappa-\lambda} = 0$$

$\kappa$ now denotes the highest power occurring in the series. The indicial equation
is obtained by putting the coefficient of the highest power of $x$ equal to
zero. Thus

$$\kappa(\kappa + 1) - l(l+1) = 0$$

whence $\kappa = l$ or $-l - 1$. The recurrence relation becomes, on replacing $j$ by $j + 2$,

$$a_{j+2} = \frac{(\kappa - j)(\kappa - j - 1)}{(\kappa - j - 2)(\kappa - j - 1) - l(l+1)}\,a_j \tag{2-36}$$

Choice 1. Let us take $\kappa = l$. Eq. (36) then reads

$$a_{j+2} = \frac{(l - j)(l - j - 1)}{(j + 2)(j - 2l + 1)}\,a_j$$

If $a_0$ is chosen arbitrarily, the series becomes

$$y = a_0x^l\left\{1 - \frac{l(l-1)}{2(2l-1)}x^{-2} + \frac{l(l-1)(l-2)(l-3)}{2\cdot4(2l-1)(2l-3)}x^{-4} - \cdots + (-1)^r\frac{(l-2r+1)(l-2r+2)\cdots(l-1)l}{2\cdot4\cdots2r\,(2l-2r+1)\cdots(2l-1)}x^{-2r} + \cdots\right\} \tag{2-37}$$

The series formally obtained by putting $a_0 = 0$ is of no interest since it
violates the assumption, previously made, that $\kappa$, i.e., $l$, represents the
highest power of the sequence. We shall therefore omit it at once.

Choice 2. Let us take $\kappa = -l - 1$. Then

$$a_{j+2} = \frac{(j + l + 1)(j + l + 2)}{(j + 2)(2l + j + 3)}\,a_j$$

If again we put $a_1 = 0$, there results the particular solution

$$y = a_0x^{-l-1}\left\{1 + \frac{(l+1)(l+2)}{2(2l+3)}x^{-2} + \frac{(l+1)(l+2)(l+3)(l+4)}{2\cdot4(2l+3)(2l+5)}x^{-4} + \cdots + \frac{(l+1)\cdots(l+2r)}{2\cdot4\cdots2r\,(2l+3)\cdots(2l+2r+1)}x^{-2r} + \cdots\right\} \tag{2-38}$$

The two solutions (37) and (38) are independent, hence their sum represents
the general solution of Legendre's equation. It is easily seen to converge
if $|x| > 1$, unless $l$ has such a value that the denominator of one of
the coefficients in the series vanishes. This case will be studied shortly.
We are now in possession of two forms of solution of eq. (28). The first
(eqs. 34 and 35) converges when $|x| < 1$, the second (eqs. 37 and 38)
when $|x| > 1$. Under special circumstances, however, (34) or (35) as
well as (37) or (38) may become polynomials, which remain finite for every
finite value of $x$. It is interesting to see what happens to the various
particular solutions when this contingency arises.

Eq. (34) reduces to a polynomial when $l$ is an even positive or an odd
negative integer (or zero).

a. Let $l$ be even and positive; $l = 2k$. (34) then becomes

$$y = a_0\left[1 - \frac{l(l+1)}{2!}x^2 + \cdots + (-1)^{l/2}\frac{l(l-2)\cdots2\,(l+1)\cdots(2l-1)}{l!}x^l\right]$$

On the other hand, (37) becomes under these conditions

$$y = a_0x^l\left[1 - \frac{l(l-1)}{2(2l-1)}x^{-2} + \cdots + (-1)^{l/2}\frac{l!}{l(l-2)\cdots2\,(l+1)\cdots(2l-1)}x^{-l}\right]$$

These two solutions become identical if the second is multiplied by the
constant factor

$$(-1)^{l/2}\,\frac{l(l-2)\cdots2\,(l+1)\cdots(2l-1)}{l!}$$

Hence the particular solution (34) coalesces with (37).


b. Let $l$ be odd and negative. Inspection shows that (34) then becomes
identical with (38).

Eq. (35) reduces to a polynomial when $l$ is an odd positive or an even
negative integer.

c. If $l$ is odd and positive, (35) reads

$$y = a_1\left[x - \frac{(l-1)(l+2)}{3!}x^3 + \cdots + (-1)^{(l-1)/2}\frac{(l-1)(l-3)\cdots2\,(l+2)\cdots(2l-1)}{l!}x^l\right]$$

while (37) becomes

$$y = a_0x^l\left[1 - \frac{l(l-1)}{2(2l-1)}x^{-2} + \cdots + (-1)^{(l-1)/2}\frac{l!}{2\cdot4\cdots(l-1)\,(l+2)\cdots(2l-1)}x^{-l+1}\right]$$

These two expressions become identical when the second is divided by
the coefficient of its last term in parenthesis.

d. If $l$ is an even and negative integer (35) turns into (38).

Having established these important relations between solutions (34), (35), (37) and
(38) we now return to the consideration of (37) and (38). Solutions (37)
and (38) for integral values of $l$ are of great importance in mathematical
physics. If the constant $a_0$ in (37) is chosen to be $1\cdot3\cdot5\cdots(2l-1)/l!$,
$l$ being a positive integer, the resulting polynomial is denoted by $P_l(x)$. For
purposes of reference we write it down again:

$$P_l(x) = \frac{1\cdot3\cdot5\cdots(2l-1)}{l!}\left\{x^l - \frac{l(l-1)}{2(2l-1)}x^{l-2} + \frac{l(l-1)(l-2)(l-3)}{2\cdot4(2l-1)(2l-3)}x^{l-4} - \cdots\right\} \tag{2-39}$$

The series here is to be continued down to the constant term. On the
other hand, (38) with the constant $a_0$ chosen to be $2^l(l!)^2/(2l+1)!$,
$l$ being a positive integer, is often denoted by $Q_l$. It is an infinite series:

$$Q_l(x) = \frac{l!}{1\cdot3\cdots(2l+1)}\left\{x^{-l-1} + \frac{(l+1)(l+2)}{2(2l+3)}x^{-l-3} + \cdots + \frac{(l+1)\cdots(l+2r)}{2\cdot4\cdots2r\,(2l+3)\cdots(2l+2r+1)}x^{-l-2r-1} + \cdots\right\} \tag{2-40}$$
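The polynomial $P_l$ can be generated directly from the descending recurrence for $\kappa = l$, $a_{j+2} = \frac{(l-j)(l-j-1)}{(j+2)(j-2l+1)}a_j$, together with the leading constant $1\cdot3\cdot5\cdots(2l-1)/l!$. The following sketch assumes that normalisation and uses a hypothetical helper name, `legendre_P`; it is meant only as a check of the formulas for small non-negative integers $l$.

```python
from fractions import Fraction
from math import factorial

def legendre_P(l):
    """Return {power: coefficient} for P_l, built from the descending series
    with kappa = l and leading constant 1*3*5*...*(2l-1) / l!."""
    lead = Fraction(1)
    for odd in range(1, 2 * l, 2):
        lead *= odd
    lead /= factorial(l)
    coeffs = {}
    c = Fraction(1)   # coefficient inside the braces; 1 for the x^l term
    j = 0
    while l - j >= 0:
        coeffs[l - j] = lead * c
        # a_{j+2} = (l-j)(l-j-1) / ((j+2)(j-2l+1)) * a_j ; for even j the
        # factor (j - 2l + 1) is odd and therefore never zero.
        c *= Fraction((l - j) * (l - j - 1), (j + 2) * (j - 2 * l + 1))
        j += 2
    return coeffs

print(legendre_P(2))   # coefficients of (3x^2 - 1)/2
print(legendre_P(3))   # coefficients of (5x^3 - 3x)/2
```

With this normalisation the coefficients of each polynomial sum to 1, i.e. $P_l(1) = 1$, which gives a convenient check on the leading constant.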


The following facts will be noted:

When $l$ is a positive integer, (37) is a polynomial, but (38) is an infinite
series. The general solution of (28) is a linear combination of (37)
and (38).

When $l$ is a negative integer, (37) is an infinite series, and (38) is a
polynomial. The general solution of (28) is a linear combination of (37)
and (38).

When $2l$ is equal to some positive odd integer, solution (37) degenerates
into (38). To see this, suppose $2l = 2n - 1$. There will then appear a
vanishing denominator in the coefficient of $x^{l-2n}$ and in every subsequent
term of (37). To remove these infinities one may multiply the entire series
by $(n - r)$, which causes all terms of order higher than $l - 2n$ to vanish
while the others remain finite. Hence the series begins with the power
$x^{l-2n}$ and inspection shows it then to be identical with (38).
In this case, our method has yielded but one particular solution, and this is
an infinite series. Procedures leading to a general solution are discussed in
treatises on Differential Equations. 7

When $2l$ is equal to an odd negative integer, (38) degenerates into (37)
in a manner similar to the above. In that case also no general solution can
be obtained by the present method.

Having now given a fairly complete mathematical analysis of the solutions
of Legendre's equation, we state some conclusions of practical importance.
In almost all applications (cf. Chapters 7, 8, 11) the independent
variable $x$ appearing in eq. (28) is the cosine of an angle. The functions of
interest are therefore those which remain finite for all values which $x = \cos\theta$
can assume; these values include $x = \pm1$. Such functions exist only when

7 See Forsyth, A. R., "Differential Equations," Macmillan Co., London, 1914.




$l$ is a positive or a negative integer, as we have shown. But when $l$ is an
integer, consideration may be limited to solutions (37) and (38), because
the others reduce to these. Moreover, inspection shows that solution (38)
with $l$ replaced by $-(l + 1)$ is the same as solution (37). Hence we may
further limit our consideration to positive values of $l$ (including 0) and
retain only (37) as a significant solution. Finally we note that (37) is
identical with (39). Hence:

In physico-chemical problems, where $x = \cos\theta$, the only solution of
Legendre's equation which is of practical interest is $P_l(\cos\theta)$.

Problems.

a. Prove that, when $l$ is an even negative integer, the expressions (35) and (38)
become identical.

b. Prove that, when $2l$ is an odd negative integer, expressions (37) and (38) become
identical.

Differential Equation for Associated Legendre Functions, or Associated
Spherical Harmonics.

An equation similar to Legendre's plays a considerable role in mathematical
physics. It is 8

$$(1 - x^2)y'' - 2xy' + \left[l(l+1) - \frac{m^2}{1 - x^2}\right]y = 0 \tag{2-41}$$

where $l$ and $m$ are both integers, and has a particular solution:

$$y = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x) \tag{2-42}$$

The other particular solution is related to $Q_l$ and is of lesser interest in
applications. To construct (42) by the method of series integration is
perfectly feasible, but we shall here use a simpler method based on the
foregoing results. If $P_l(x)$ is a solution of

$$(1 - x^2)y'' - 2xy' + l(l+1)y = 0$$

then

$$\frac{d^mP_l}{dx^m} \equiv P_l^{(m)}(x)$$

satisfies


* The equation occurs more commonly in the equivalent forms 


d 2 y 


dy , r 


7/7 I f \ 


m 2 “1 


$$(1 - x^2)P_l^{(m)\prime\prime} - 2(m+1)xP_l^{(m)\prime} + [l(l+1) - m(m+1)]P_l^{(m)} = 0 \tag{2-43}$$

as is seen when Legendre's equation is differentiated $m$ times. Now let

$$P_l^{(m)}(x) = (1 - x^2)^ry \tag{2-44}$$

and determine, by substituting this into (43), what differential equation $y$
must satisfy. After substitution, (43) will read

$$(1 - x^2)^{r-1}\{(4r^2x^2 - 2r - 2rx^2)y - 4r(1 - x^2)xy' + (1 - x^2)^2y'' - 2(m+1)(1 - x^2)xy' + 4r(m+1)x^2y + [l(l+1) - m(m+1)](1 - x^2)y\} = 0$$

If here the special value $r = -m/2$ is chosen, this equation reduces to (41).
We have shown, therefore, that (44) is true with $r = -m/2$ and hence that

$$y = (1 - x^2)^{m/2}P_l^{(m)}(x)$$

as was asserted. The function $P_l^{(m)}$, which is a polynomial of degree $l - m$
and which satisfies eq. (43), is sometimes referred to by physicists as
Helmholtz' function. The function (42) is known as an associated Legendre
function, or more frequently, an associated spherical harmonic.
2.12. General Considerations Regarding Series Integration. Fuchs'
Theorem. Before continuing, the reader will wish to know the limits of
applicability of the method applied in sec. 2.11, and in particular what
properties of the solution one may read directly from the differential
equation. First, then, let us ask the question: Will the method described
in sec. 2.11 always work? In preparation for the answer, we consider the
differential equation

$$y'' + y/x^3 = 0$$

On putting $y = \sum_\lambda a_\lambda x^{\kappa+\lambda}$ it is seen that

$$\sum_\lambda a_\lambda(\kappa + \lambda)(\kappa + \lambda - 1)x^{\kappa+\lambda-2} = -\sum_\lambda a_\lambda x^{\kappa+\lambda-3}$$

The indicial equation, obtained by putting the coefficient of the lowest
power of $x$ equal to zero, simply reads

$$a_0 = 0$$

and does not determine $\kappa$. Furthermore,

$$a_{j+1} = -(\kappa + j)(\kappa + j - 1)a_j$$

so that $a_1 = -a_0(\kappa - 1)\kappa$. Since $a_0 = 0$, this means that either $\kappa = \infty$ or
$a_1$ is also zero. In neither case do we get any solution at all.




Equally instructive is the equation

$$y'' + x^{-2}y' = 0$$

Its indicial equation yields $\kappa = 0$. The recurrence relation for the
coefficients is

$$a_{j+1} = -\frac{j(j-1)}{j+1}\,a_j$$

Thus we have apparently determined a solution. But let us apply the
convergence test. Denoting again the terms of the series by $u_n$ one sees that

$$\lim_{n\to\infty}\left|\frac{u_{n+1}}{u_n}\right| = \lim_{n\to\infty}\frac{|a_{n+1}|\,|x^{n+1}|}{|a_n|\,|x^n|} = \lim_{n\to\infty}\frac{(n-1)n}{n+1}\cdot x = nx$$

This is greater than 1 as $n \to \infty$ for every finite value of $x$, so that there is no
range of $x$ at all in which the series converges. Again, the method fails.

To enlarge our outlook, let us now return to the general form of the
equation we wish to solve, that is, to eq. (27). As a rule there are
values of $x$ for which one or both of the functions $X_1$ and $X_2$ become
infinite. If $x = x_0$ is such a value, then $x_0$ is said to be a singular point
of the equation. It is at such singular points that the method of integration
in series may break down. To be more specific, a solution of the form
$y = \sum_\lambda a_\lambda(x - x_0)^{\kappa+\lambda}$ may not exist at singular points $x_0$.

In dealing with Legendre's equation, a power series development was
attempted about the point $x_0 = 0$. It succeeded because, after writing
the equation in the form (27), neither $X_1 = -2x/(1 - x^2)$ nor
$X_2 = l(l+1)/(1 - x^2)$ becomes infinite at $x = 0$. But the points $x = \pm1$ are
singular points of the equation, and it is for this reason that the series
solution obtained breaks down at these two points. Again, the two
equations just considered, $y'' + x^{-3}y = 0$ and $y'' + x^{-2}y' = 0$, possess a
singular point at $x = 0$, and this is the cause of the failure of the series
method.

But while the method often fails if the differential equation has a singular
point at the place where the power series development is attempted, it
does not always do so. For instance, the equation

$$y'' + x^{-1}y' - x^{-2}y = 0$$

has a singular point at $x = 0$. Its indicial equation, $\kappa^2 - 1 = 0$, yields
$\kappa = \pm1$, and the coefficients must satisfy $[(\kappa + j)^2 - 1]a_j = 0$. For $\kappa = 1$
this reads $[(j + 1)^2 - 1]a_j = 0$, which says that every $a_j = 0$, except for $j = 0$.
The corresponding solution is $y = a_0x$. For $\kappa = -1$ we have

$$[(j - 1)^2 - 1]a_j = 0$$

and this indicates that all coefficients must be zero except that corresponding
to $j = 0$ and to $j = 2$. Hence the solution is

$$y = x^{-1}(a_0 + a_2x^2)$$

The constants $a_0$ and $a_2$ are arbitrary, which implies that the solution is a
general one, including $y = \text{const.}\;x$ as a special case. Obviously, then, it is
important to settle what kind of singularities do, and what kind do not,
permit an integration in series about the singular point.

This issue is settled by an important theorem due to Fuchs, which states
the following:

If the differential equation

$$y'' + X_1y' + X_2y = 0$$

possesses a singular point at $x = x_0$, then a convergent development of the
solution in a power series about the point $x = x_0$ having only a finite number
of terms with negative exponents is nevertheless possible provided that
$(x - x_0)X_1(x_0)$ and $(x - x_0)^2X_2(x_0)$ remain finite.

This clearly is true for the equation

$$y'' + x^{-1}y' - x^{-2}y = 0$$

at $x_0 = 0$, but not for

$$y'' + x^{-3}y = 0$$

Thus the results just obtained are accounted for. The proof of Fuchs'
theorem is a matter of some length and will not be undertaken here. 9
In conformity with the theorem singularities in $X_1$ and $X_2$ occurring at
$x = x_i$ which are removable by multiplication by the factors $(x - x_i)$
and $(x - x_i)^2$ respectively are called non-essential singularities of the
differential equation; all others are essential ones. 10 All regular and
non-essentially singular points are sometimes referred to as regular points of the
differential equations (German: "Stellen der Bestimmtheit"). An
equation which has no essential singularities in the entire infinite complex
plane is said to belong to the Fuchsian class of differential equations.

9 See, for instance, Schmidt, H., "Theorie der Wellengleichung," Leipzig, 1931.

10 Whether the point at infinity is an essentially singular one cannot at once be seen
in this way. To examine it the transformation $\xi = 1/x$ must be made. One may then
show that the point at infinity is essentially singular if $X_1x$ or $X_2x^2$ become infinite there;
non-essentially singular if $2x - X_1x^2 \to \infty$ or $X_2x^4 \to \infty$; otherwise it is regular.




A final remark on the nature of the solutions obtained by the method of 
integration in series is in order. Even if the point at which the develop- 
ment is made satisfies the Fuchs conditions it may not be possible to obtain 
two independent solutions which, when combined linearly with the use of 
two arbitrary constants, will yield the general solution. If this process is 
to produce a general solution, further conditions must be met. Since 
general solutions are not often required in physical and chemical applications,
this matter will not be considered in detail here. 11 We note, however,
that two independent solutions in the form $y_1 = \sum_\lambda a_\lambda(x - x_0)^{\kappa_1+\lambda}$
and $y_2 = \sum_\lambda a_\lambda(x - x_0)^{\kappa_2+\lambda}$ can always be obtained when the two roots of the
indicial equation, $\kappa_1$ and $\kappa_2$, do not differ by an integer or by zero.


SPECIAL EQUATIONS SOLVABLE BY SERIES INTEGRATION 
2.13. Gauss’ (Hypergeometric) Differential Equation. — 

$$(x^2 - x)y'' + [(1 + \alpha + \beta)x - \gamma]y' + \alpha\beta y = 0 \tag{2-45}$$

The parameters $\alpha$, $\beta$, $\gamma$ are constants, and it will be assumed that $\gamma$ is not an integer. Eq. (45) has singularities at 0, 1, and $\infty$, but they are all non-essential. On development about $x = 0$, the indicial equation reads

$$\kappa(\kappa - 1) + \kappa\gamma = 0$$

hence $\kappa = 0,\ 1 - \gamma$. Choosing $\kappa = 0$, we obtain the recurrence formula

$$a_{j+1} = \frac{(\alpha + j)(\beta + j)}{(j + 1)(j + \gamma)}\, a_j \tag{2-46}$$


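and hence the particular solution

$$y = a_0\left\{1 + \frac{\alpha\beta}{1\cdot\gamma}x + \frac{\alpha(\alpha+1)\,\beta(\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}x^2 + \cdots + \frac{\alpha(\alpha+1)\cdots(\alpha+r-1)\cdot\beta(\beta+1)\cdots(\beta+r-1)}{r!\,\gamma(\gamma+1)\cdots(\gamma+r-1)}x^r + \cdots\right\} \tag{2-47}$$

The series in { } is known as the hypergeometric series. It converges if $|x| < 1$. For $\alpha = 1$, $\beta = \gamma$ it reduces to the ordinary geometric series; hence its name. It is customary to denote the hypergeometric series by $F(\alpha,\beta,\gamma;x)$. With this abbreviation, then, this particular solution is

$$y = a_0 F(\alpha,\beta,\gamma;x)$$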
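The recurrence formula (46) lends itself to direct numerical summation. The following is a minimal sketch only: the function name `hypergeometric_series` and the fixed number of terms are illustrative assumptions, and no attempt is made to estimate the truncation error.

```python
def hypergeometric_series(alpha, beta, gamma, x, terms=200):
    """Partial sum of F(alpha, beta, gamma; x), built term by term from recurrence (2-46)."""
    if abs(x) >= 1:
        raise ValueError("the series converges only for |x| < 1")
    coefficient = 1.0   # a_0 = 1
    power = 1.0         # x**j
    total = 0.0
    for j in range(terms):
        total += coefficient * power
        # a_{j+1} = (alpha + j)(beta + j) / ((j + 1)(j + gamma)) * a_j
        coefficient *= (alpha + j) * (beta + j) / ((j + 1) * (j + gamma))
        power *= x
    return total


# For alpha = 1, beta = gamma every coefficient equals 1: the geometric series.
geometric = hypergeometric_series(1, 0.5, 0.5, 0.3)
# For alpha = -2 the series stops after three terms: 1 - 6x + 6x^2.
polynomial = hypergeometric_series(-2, 3, 1, 0.5)
```

The first call should agree with $1/(1 - x)$, which is the reduction to the geometric series mentioned above; the second illustrates how a negative integer $\alpha$ cuts the series off.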


Next, we take $\kappa = 1 - \gamma$. The recurrence relation reads

$$a_{j+1} = \frac{(\alpha - \gamma + 1 + j)(\beta - \gamma + 1 + j)}{(j + 1)(j + 2 - \gamma)}\, a_j \tag{2-48}$$

When the new constants: $\alpha' = \alpha - \gamma + 1$, $\beta' = \beta - \gamma + 1$, $\gamma' = 2 - \gamma$, are introduced in (48) it becomes

$$a_{j+1} = \frac{(\alpha' + j)(\beta' + j)}{(j + 1)(j + \gamma')}\, a_j$$

that is, it takes the same form as (46). The particular solution corresponding to (48) may therefore be written

$$a_0 x^{1-\gamma} F(\alpha - \gamma + 1,\ \beta - \gamma + 1,\ 2 - \gamma;\ x)$$

We have thus arrived at the following general solution of (45):

$$y = A F(\alpha,\beta,\gamma;x) + B x^{1-\gamma} F(\alpha - \gamma + 1,\ \beta - \gamma + 1,\ 2 - \gamma;\ x) \tag{2-49}$$

whose range of convergence is $|x| < 1$.

There is an interesting and sometimes useful relation between the solutions of Gauss' and those of Legendre's equation. Let us introduce in (45) the new independent variable $\xi$, given by

$$x = \tfrac{1}{2}(1 - \xi)$$

so that it takes the form

$$(1 - \xi^2)\frac{d^2y}{d\xi^2} + [1 + \alpha + \beta - 2\gamma - (\alpha + \beta + 1)\xi]\frac{dy}{d\xi} - \alpha\beta y = 0 \tag{2-50}$$

This reduces to Legendre's equation (28) if we specify the constants to be

$$\alpha = l + 1,\quad \beta = -l,\quad \gamma = 1$$

A particular solution of Legendre's equation is therefore

$$y = a_0 F\!\left(l + 1,\ -l,\ 1;\ \frac{1 - \xi}{2}\right)$$

From the fact that this solution, expanded in powers of $\xi$, starts with a constant term it is clear that it must be identical (aside from a constant factor) with (34). In particular, if $l$ is a positive integer, it must be $P_l$. This happens to be true, as the reader may verify, even with respect to the constant factor if $P_l$ is defined as in (39). Thus

$$P_l(\xi) = F\!\left(l + 1,\ -l,\ 1;\ \frac{1 - \xi}{2}\right) \tag{2-51}$$
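Relation (51) can be checked numerically for small $l$. The sketch below is an illustration under stated assumptions: both function names are invented here, and the cross-check uses the familiar three-term recursion $(n+1)P_{n+1} = (2n+1)\xi P_n - nP_{n-1}$, which is a standard result and not taken from this section.

```python
def legendre_via_gauss(l, xi):
    """P_l(xi) from F(l + 1, -l, 1; (1 - xi)/2); the series stops after l + 1 terms."""
    x = (1.0 - xi) / 2.0
    coefficient, power, total = 1.0, 1.0, 0.0
    for j in range(l + 1):
        total += coefficient * power
        # recurrence (2-46) with alpha = l + 1, beta = -l, gamma = 1
        coefficient *= (l + 1 + j) * (-l + j) / ((j + 1) * (j + 1))
        power *= x
    return total


def legendre_via_recursion(l, xi):
    """P_l(xi) from the standard three-term recursion, used only as a cross-check."""
    previous, current = 1.0, xi
    if l == 0:
        return previous
    for n in range(1, l):
        previous, current = current, ((2 * n + 1) * xi * current - n * previous) / (n + 1)
    return current
```

Because the hypergeometric series terminates for integral $l$, the comparison is not restricted to $|x| < 1$.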

An equation known to mathematicians as Tschebyscheff's results when in (50) we specialize the constants as follows:




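Its solution is clearly

$$y(\xi) = A F\!\left(n,\ -n,\ \tfrac12;\ \frac{1-\xi}{2}\right) + B\left(\frac{1-\xi}{2}\right)^{1/2} F\!\left(n + \tfrac12,\ -n + \tfrac12,\ \tfrac32;\ \frac{1-\xi}{2}\right) \tag{2-53}$$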

The first particular solution here written is a polynomial known as the Tschebyscheff polynomial, of degree n. If multiplied by the proper factor it has the alternative form:

$$T_n(x) = 2^{n-1}\left(x^n - \frac{n}{1!\,2^2}x^{n-2} + \frac{n(n-3)}{2!\,2^4}x^{n-4} - \frac{n(n-4)(n-5)}{3!\,2^6}x^{n-6} + \cdots\right) \tag{2-54}$$

This development stops with a constant or a term proportional to x.
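As a plausibility check on (54), the series can be summed and compared with the well-known property $T_n(\cos\theta) = \cos n\theta$ (a standard result, not derived in this section). The sketch assumes that the general bracketed coefficient is $(-1)^k\, n\,(n-k-1)!/[k!\,(n-2k)!\,2^{2k}]$, which reproduces the four terms written out above; the function name is illustrative.

```python
import math


def chebyshev_from_series(n, x):
    """T_n(x) for n >= 1 from the descending power series (2-54)."""
    total = 0.0
    for k in range(n // 2 + 1):
        # coefficient of x**(n - 2k) inside the bracket
        coefficient = (-1) ** k * n * math.factorial(n - k - 1) / (
            math.factorial(k) * math.factorial(n - 2 * k) * 4 ** k)
        total += coefficient * x ** (n - 2 * k)
    return 2 ** (n - 1) * total
```

For n = 2 and n = 3 the function returns $2x^2 - 1$ and $4x^3 - 3x$ respectively.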

The function $F(\alpha,\beta,\gamma;x)$ reduces to a polynomial when $\alpha = -n$, n being a positive integer, as may be seen from its definition (47). The resulting polynomial, which is of degree n, is known as a Jacobi polynomial, defined as follows:

$$J_n(p,q;x) = F(-n,\ p + n,\ q;\ x) \tag{2-55}$$

It satisfies the differential equation

$$(x^2 - x)y'' + [(1 + p)x - q]y' - n(p + n)y = 0 \tag{2-56}$$

in which q must satisfy $q > 0$. Substitution of $\alpha = -n$, $\beta = p + n$, $\gamma = q$ into (47) shows that 12

$$J_n(p,q;x) = 1 + \sum_{\lambda=1}^{n} (-1)^\lambda \binom{n}{\lambda} \frac{(p+n)(p+n+1)\cdots(p+n+\lambda-1)}{q(q+1)\cdots(q+\lambda-1)}\, x^\lambda$$


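Problem. Find the solution of (45) about the point $x = 1$; i.e., find solutions of the form

$$y = \sum_\lambda a_\lambda (x - 1)^{\kappa+\lambda}$$

Ans.

$$y = A F(\alpha,\ \beta,\ \alpha + \beta - \gamma + 1;\ 1 - x) + B(1 - x)^{\gamma-\alpha-\beta} F(\gamma - \beta,\ \gamma - \alpha,\ 1 - \alpha - \beta + \gamma;\ 1 - x)$$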


2.14. Bessel’s Equation. — 

$$x^2 y'' + x y' + (x^2 - n^2)y = 0 \tag{2-57}$$

n is a constant. Since the equation is regular at $x = 0$, its solution may be developed in a power series about the origin. The indicial equation has the two roots $\kappa = \pm n$. According to the remarks at the end of sec. 2.12 we can obtain two independent particular solutions if 2n is not an integer; if it is, the method may allow us to determine only one. Taking $\kappa = n$ one finds


$$y = a_0 x^n\left\{1 - \frac{x^2}{2(2n+2)} + \frac{x^4}{2\cdot 4(2n+2)(2n+4)} - \cdots + (-1)^r\frac{x^{2r}}{2\cdot 4\cdots 2r\,(2n+2)(2n+4)\cdots(2n+2r)} + \cdots\right\} \tag{2-58}$$

For $\kappa = -n$

$$y = a_0 x^{-n}\left\{1 + \frac{x^2}{2(2n-2)} + \frac{x^4}{2\cdot 4(2n-2)(2n-4)} + \cdots + \frac{x^{2r}}{2\cdot 4\cdots 2r\,(2n-2)(2n-4)\cdots(2n-2r)} + \cdots\right\} \tag{2-59}$$

When the constant $a_0$ in (58) is chosen to be 13 $1/[2^n\Gamma(n+1)]$, the resulting expression

$$y = J_n(x) = \sum_{\lambda=0}^{\infty} \frac{(-1)^\lambda}{\Gamma(\lambda+1)\Gamma(\lambda+n+1)}\left(\frac{x}{2}\right)^{n+2\lambda} \tag{2-60}$$

is called a Bessel function of order n.

When (59) is multiplied by the same factor (with n replaced by $-n$) it becomes $J_{-n}(x)$. Hence the complete solution of Bessel's equation (when n is not an integer) is

$$y = A J_n(x) + B J_{-n}(x) \tag{2-61}$$
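Series (60) converges rapidly for moderate x and is easy to sum directly. The following sketch uses the standard-library `math.gamma`; the function name `bessel_j` and the number of terms are illustrative choices. The closed forms $J_{1/2}(x) = \sqrt{2/\pi x}\,\sin x$ and $J_{-1/2}(x) = \sqrt{2/\pi x}\,\cos x$, used here for comparison, are standard results quoted without proof.

```python
import math


def bessel_j(order, x, terms=40):
    """Partial sum of series (2-60) for J_order(x); order must not be a negative integer."""
    total = 0.0
    for lam in range(terms):
        total += ((-1) ** lam
                  / (math.gamma(lam + 1) * math.gamma(lam + order + 1))
                  * (x / 2.0) ** (order + 2 * lam))
    return total


x = 1.5
half = bessel_j(0.5, x)         # compare with sqrt(2 / (pi x)) * sin x
minus_half = bessel_j(-0.5, x)  # compare with sqrt(2 / (pi x)) * cos x
```

The two half-integral orders give independent solutions, in line with the remark that no difficulty arises when n is half-integral.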


Inspection of (58) and (59) shows that no difficulty arises when n is half-integral, although the difference of the roots of the indicial equation is an integer. But if n is an integer, $J_{-n}$ is no longer independent of $J_n$. For in that case the coefficient of $x^n$ in (59) has a vanishing term in the denominator, and every subsequent coefficient likewise becomes infinite. Multiplication by the vanishing term makes every term preceding the n-th zero. The series then starts with $x^n$ and is seen to be identical (except for a constant multiplier) with (58). For integral n, therefore, we have obtained only one solution, namely $J_n(x)$. 14 By choosing the constants A and B of

13 The Gamma function appearing here is a generalization of the factorial n! which is defined only for integers (and zero). If n is an integer, $\Gamma(n + 1) = n!$. In general, $\Gamma(x + 1) = \int_0^\infty e^{-t}t^{x}\,dt$; it is easily seen to reduce to n! when $x = n$. Moreover, this integral defines the "smoothest" function which takes on the values n! at the integers. Cf. sec. 3.2.




(61) suitably, several particular solutions of Bessel's equation (such as Neumann's and Hankel's functions) having useful properties may be constructed. They will be discussed in sec. 3.9.

2.15. Hermite’s Differential Equation. — 

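$$y'' - 2xy' + 2\alpha y = 0;\quad \alpha = \text{constant} \tag{2-62}$$

The roots of the indicial equation are $\kappa = 0, 1$; the recurrence relations between the coefficients are

$$a_{j+2} = \frac{2(\kappa + j) - 2\alpha}{(\kappa + j + 2)(\kappa + j + 1)}\, a_j$$

For $\kappa = 0$ we find the solution

$$y = a_0\left(1 - \frac{2\alpha}{2!}x^2 + \frac{2^2\alpha(\alpha-2)}{4!}x^4 - \frac{2^3\alpha(\alpha-2)(\alpha-4)}{6!}x^6 + \cdots + (-2)^r\frac{\alpha(\alpha-2)\cdots(\alpha-2r+2)}{(2r)!}x^{2r} + \cdots\right) \tag{2-63}$$

while for $\kappa = 1$

$$y = a_0 x\left(1 - \frac{2(\alpha-1)}{3!}x^2 + \frac{2^2(\alpha-1)(\alpha-3)}{5!}x^4 - \cdots + (-2)^r\frac{(\alpha-1)(\alpha-3)\cdots(\alpha-2r+1)}{(2r+1)!}x^{2r} + \cdots\right) \tag{2-64}$$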


The general solution of Hermite's equation is a superposition of these. If $\alpha$ is an even integer n, (63) reduces to an even polynomial of degree n. On choosing for $a_0$ the value

$$(-1)^{n/2}\frac{n!}{(n/2)!}$$

this polynomial becomes

$$H_n(x) = (2x)^n - \frac{n(n-1)}{1!}(2x)^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2!}(2x)^{n-4} - \cdots \tag{2-65}$$

and this is known as the Hermite polynomial of degree n. If $\alpha$ is an odd integer, n, (64) reduces to an odd polynomial of degree n. In fact if we choose for $a_0$ the value




An equation very similar to that of Hermite is

$$y'' + (1 - x^2 + 2\alpha)y = 0 \tag{2-66}$$

For if we put $y = e^{-x^2/2}v$, so that $y'' = \{(x^2 - 1)v - 2xv' + v''\}e^{-x^2/2}$, the equation turns into

$$v'' - 2xv' + 2\alpha v = 0$$

which is identical with (62). Hence the solution of (66) is simply any solution of Hermite's equation, multiplied by $e^{-x^2/2}$.
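The finite sum (65) can be evaluated directly. The sketch below assumes that the same closed form, with general coefficient $n!/[(n-2k)!\,k!]$, also serves for odd n (a standard result, although (65) is stated above for even n); the function name `hermite` is illustrative. The recurrence $H_{n+1} = 2xH_n - 2nH_{n-1}$ used in the comparison is likewise a standard property quoted without proof.

```python
import math


def hermite(n, x):
    """H_n(x) from the finite sum (2-65)."""
    total = 0.0
    for k in range(n // 2 + 1):
        # n(n-1)...(n-2k+1) / k!  equals  n! / ((n-2k)! k!), always an integer
        coefficient = math.factorial(n) // (math.factorial(n - 2 * k) * math.factorial(k))
        total += (-1) ** k * coefficient * (2 * x) ** (n - 2 * k)
    return total


x = 0.37
recurrence_gaps = [hermite(n + 1, x) - (2 * x * hermite(n, x) - 2 * n * hermite(n - 1, x))
                   for n in range(1, 8)]
```

Each entry of `recurrence_gaps` should vanish to rounding error; for instance $H_2 = 4x^2 - 2$ and $H_4 = 16x^4 - 48x^2 + 12$.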

2.16. Laguerre’s Differential Equation. — 

$$xy'' + (1 - x)y' + \alpha y = 0;\quad \alpha = \text{constant} \tag{2-67}$$

has a non-essential singularity at the origin. Developing about $x = 0$, the indicial equation has the single root $\kappa = 0$. Only one solution will be obtained, this being of considerable importance in physics. The recurrence relation reads:

$$a_{j+1} = \frac{j - \alpha}{(j + 1)^2}\, a_j$$

hence

$$y = a_0\left(1 - \alpha x + \frac{\alpha(\alpha-1)}{(2!)^2}x^2 - \cdots + (-1)^r\frac{\alpha(\alpha-1)\cdots(\alpha-r+1)}{(r!)^2}x^r + \cdots\right) \tag{2-68}$$


This expression becomes a polynomial when $\alpha = n$, a positive integer. On putting

$$a_0 = n!$$

and for integral n, y becomes the Laguerre polynomial of degree n:

$$L_n(x) = (-1)^n\left(x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n n!\right) \tag{2-69}$$
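A short numerical sketch of (69) follows. It assumes the general bracketed term $(-1)^k[n!/(n-k)!]^2 x^{n-k}/k!$, which reproduces the terms displayed above and gives $L_n(0) = n!$; the function name `laguerre` is illustrative. The recurrence $L_{n+1} = (2n + 1 - x)L_n - n^2L_{n-1}$, valid in this normalization, is a standard property quoted without proof.

```python
import math


def laguerre(n, x):
    """L_n(x) in the normalization of (2-69), for which L_n(0) = n!."""
    total = 0.0
    for k in range(n + 1):
        # k-th term inside the bracket: (-1)^k [n!/(n-k)!]^2 / k! * x^(n-k)
        falling = math.factorial(n) // math.factorial(n - k)
        total += (-1) ** k * falling ** 2 / math.factorial(k) * x ** (n - k)
    return (-1) ** n * total


x = 1.3
recurrence_gaps = [laguerre(n + 1, x) - ((2 * n + 1 - x) * laguerre(n, x) - n * n * laguerre(n - 1, x))
                   for n in range(1, 6)]
```

For example, the function gives $L_1 = 1 - x$ and $L_2 = x^2 - 4x + 2$.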


A differential equation at once reducible to Laguerre's is

$$xy'' + (k + 1 - x)y' + (\alpha - k)y = 0,\quad k \text{ an integer} > 0 \tag{2-70}$$

It results when (67) is differentiated k times and y is replaced by its k-th derivative. Hence a solution of (70) for integral and positive $\alpha$ and k is


n — /c. 

A third function closely related to the Laguerre polynomials satisfies the differential equation

$$xy'' + 2y' + \left[n - \frac{k-1}{2} - \frac{x}{4} - \frac{k^2-1}{4x}\right]y = 0 \tag{2-71}$$

If we substitute in this equation $y = e^{-x/2}x^{(k-1)/2}v$, then v is seen to be a solution of

$$xv'' + (k + 1 - x)v' + (n - k)v = 0$$

Comparison with (70) shows, therefore, that $v = L_n^k(x)$. Hence a particular solution of (71) is

$$y = e^{-x/2}x^{(k-1)/2}L_n^k(x) \tag{2-72}$$

This function is known as an associated Laguerre function; it is of importance in the theory of the hydrogen atom. We observe that, if n in (71) were not an integer but any constant $\alpha$, the corresponding solution of (71) would be

$$y = e^{-x/2}x^{(k-1)/2}\frac{d^k}{dx^k}L_\alpha(x)$$

where $L_\alpha$ is written for the series (68); provided, of course, that k is a positive integer. This solution would no longer be a polynomial multiplied by $e^{-x/2}$, but an infinite sequence.

2.17. Mathieu's Equation. — In the previous sections attention has been given to differential equations in which $X_1$ and $X_2$ (footnote 15) were algebraic functions of x. Equations sometimes arise in which these functions are periodic. The simplest instance of these is Mathieu's equation, usually written in the form

$$\frac{d^2y}{dx^2} + (a + 16b\cos 2x)y = 0 \tag{2-73}$$

where a and b are constants. Its general solution may be obtained by the method of integration in series if the substitution

$$\xi = \cos^2 x$$

is made. (73) then reads

$$4\xi(1 - \xi)\frac{d^2y}{d\xi^2} + 2(1 - 2\xi)\frac{dy}{d\xi} + (a - 16b + 32b\xi)y = 0 \tag{2-74}$$

15 Defined by eq. (27). 




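This equation has a non-essential singularity at $\xi = 0$ and can therefore be developed as a power series about the origin. On inserting

$$y = \sum_\lambda a_\lambda \xi^{\kappa+\lambda}$$

in (74) we obtain

$$\sum_\lambda 2(\kappa+\lambda)(2\kappa + 2\lambda - 1)a_\lambda\xi^{\kappa+\lambda-1} - \sum_\lambda [4(\kappa+\lambda)^2 - a + 16b]a_\lambda\xi^{\kappa+\lambda} + 32b\sum_\lambda a_\lambda\xi^{\kappa+\lambda+1} = 0$$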

A new feature arises which was not encountered before; the equation contains three different summations instead of two and will therefore lead to a three-term recurrence relation between the coefficients instead of the two-term relations that occurred in the former instances. This, however, requires no modification of procedure, except that it will force us to advance step by step in the computation of the coefficients. Only the first summation contributes to the lowest power of $\xi$, so that the indicial equation is formed as before:

$$2\kappa(2\kappa - 1) = 0$$

The next power of $\xi$ leads to

$$2(\kappa + 1)(2\kappa + 1)a_1 = (4\kappa^2 - a + 16b)a_0$$

so that

$$a_1 = \frac{4\kappa^2 - a + 16b}{2(\kappa + 1)(2\kappa + 1)}\, a_0$$

is fixed when the arbitrary constant $a_0$ is chosen. Similarly,

$$2(\kappa + 2)(2\kappa + 3)a_2 - [4(\kappa + 1)^2 - a + 16b]a_1 + 32b\,a_0 = 0$$

a relation permitting the calculation of $a_2$. In this way two series can be constructed, one for $\kappa = 0$ and one for $\kappa = \tfrac12$, the linear composition of which yields the general solution of (73). Investigation shows that this solution converges for $|\xi| < 1$.

... interest in physics and


solution here found, which is of the form

$$\sum_\lambda a_\lambda\xi^\lambda + \xi^{1/2}\sum_\lambda b_\lambda\xi^\lambda$$

does not possess this periodicity, as closer investigation would show. Qualitatively this defect is apparent from the failure of the solution to converge for $\xi = \pm 1$, which excludes the values $x = n\pi$ from consideration altogether, as well as from the existence of a branch point at $\xi = 0$ (arising from the factor $\xi^{1/2}$).

In fact it is impossible to obtain solutions of Mathieu's equation which are periodic and of period $2\pi$ in x, unless definite restrictions are placed
upon the constant a . It turns out that the latter must be a con 
function of b if the solution is to be periodic. 16 

Floquet's Theorem. An important theorem concerning the solution of Mathieu's equation, or indeed of any linear differential equation with periodic coefficients which are one-valued functions of x, will now be established. Suppose that $y_1(x)$ and $y_2(x)$ are two linearly independent solutions of (73), so that any particular solution y may be composed from them by means of two constants $A_1$ and $A_2$ as follows:

$$y = A_1y_1 + A_2y_2 \tag{2-76}$$

Now it is clear that, if $y_1(x)$ and $y_2(x)$ are solutions of (73), $y_1(x + 2\pi)$ and $y_2(x + 2\pi)$ will also be solutions, for the substitution of $x + 2\pi$ in place of x causes no change in the differential equation. This must, of course, not be interpreted as implying that $y_1(x + 2\pi) = y_1(x)$, or $y_2(x) = y_2(x + 2\pi)$; but it does mean that

$$y_1(x + 2\pi) = \alpha_{11}y_1(x) + \alpha_{12}y_2(x);\quad y_2(x + 2\pi) = \alpha_{21}y_1(x) + \alpha_{22}y_2(x)$$

the $\alpha$'s being constants. Similarly, using (76)

$$y(x + 2\pi) = A_1y_1(x + 2\pi) + A_2y_2(x + 2\pi) = (A_1\alpha_{11} + A_2\alpha_{21})y_1(x) + (A_1\alpha_{12} + A_2\alpha_{22})y_2(x)$$

We observe that the constants $\alpha$ are fixed by the choice of $y_1$ and $y_2$, but $A_1$ and $A_2$ may be chosen at will and still leave y a particular solution of the equation. It is possible to choose them so as to satisfy the equations

$$A_1\alpha_{11} + A_2\alpha_{21} = kA_1;\quad A_1\alpha_{12} + A_2\alpha_{22} = kA_2 \tag{2-77}$$

where k is a constant not within our control, for if eqs. (77) are to be satisfied then k must be subject to the equation

$$\begin{vmatrix}\alpha_{11} - k & \alpha_{21}\\ \alpha_{12} & \alpha_{22} - k\end{vmatrix} = 0 \tag{2-78}$$

But if (77) holds then

$$y(x + 2\pi) = k[A_1y_1(x) + A_2y_2(x)] = ky(x) \tag{2-79}$$


In other words, there exists a particular solution y(x) such that, when x is increased by $2\pi$, the solution itself is multiplied by the constant k. If k were unity, this solution would be periodic.

This result may be expressed in a different way. On putting

$$k = e^{2\pi\mu},\quad y(x) = e^{\mu x}P(x)$$

eq. (79) reads

$$e^{\mu(x + 2\pi)}P(x + 2\pi) = e^{2\pi\mu}e^{\mu x}P(x)$$

so that P(x) turns out to be a periodic function. Thus it is seen that there exists a particular solution of Mathieu's equation of the form

$$y = e^{\mu x}P(x) \tag{2-80}$$

where P is periodic. From here it is only a simple step to obtain a general solution of (73). The differential equation is insensitive to the substitution of $-x$ for x. Hence $e^{-\mu x}P(-x)$ must also be a solution. Moreover, it is an independent solution, since it is not a constant multiple of (80). The complete solution is, therefore, a linear combination of these two:

$$y = c_1e^{\mu x}P(x) + c_2e^{-\mu x}P(-x) \tag{2-81}$$

This result, known as Floquet's theorem, is of interest in some astronomical applications and chiefly in the quantum theory of metals. 17
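The constants $\alpha$ and the multiplier k can be obtained numerically by integrating Mathieu's equation over an interval of length $2\pi$. The sketch below is an illustration only: it uses a classical fourth-order Runge-Kutta scheme (a choice made here, not a method from the text), takes the solutions with initial data $(y, y') = (1, 0)$ and $(0, 1)$ as $y_1$ and $y_2$, and solves the quadratic (78). All function names are assumptions.

```python
import cmath
import math


def monodromy(a, b, steps=4000):
    """Integrate y'' + (a + 16 b cos 2x) y = 0 from 0 to 2*pi with classical RK4,
    for the initial data (y, y') = (1, 0) and (0, 1); returns the two end states."""
    def derivative(x, y, dy):
        return dy, -(a + 16.0 * b * math.cos(2.0 * x)) * y

    h = 2.0 * math.pi / steps
    end_states = []
    for y, dy in ((1.0, 0.0), (0.0, 1.0)):
        x = 0.0
        for _ in range(steps):
            k1y, k1d = derivative(x, y, dy)
            k2y, k2d = derivative(x + h / 2, y + h / 2 * k1y, dy + h / 2 * k1d)
            k3y, k3d = derivative(x + h / 2, y + h / 2 * k2y, dy + h / 2 * k2d)
            k4y, k4d = derivative(x + h, y + h * k3y, dy + h * k3d)
            y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
            dy += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
            x += h
        end_states.append((y, dy))
    return end_states


def floquet_multipliers(a, b):
    """Roots k of the determinantal equation (2-78), returned as complex numbers."""
    (a11, a12), (a21, a22) = monodromy(a, b)
    trace = a11 + a22
    determinant = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4.0 * determinant)
    return (trace + root) / 2.0, (trace - root) / 2.0
```

Since the equation contains no first-derivative term, the Wronskian of $y_1$ and $y_2$ is constant and the product of the two roots should equal unity, which is consistent with the pair $e^{2\pi\mu}$, $e^{-2\pi\mu}$ appearing in (81). For b = 0, a = 1 the solutions are cos x and sin x and both roots reduce to 1.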

Problem. Show that the Schrödinger equation

$$\frac{d^2\psi}{dx^2} + [A + V(x)]\psi = 0,$$

in which A is a constant, and V is a periodic function of x such that $V(x + l) = V(x)$, has solutions of the form

$$\psi = e^{\mu x}v(x),$$

where v is also periodic: $v(x + l) = v(x)$.

This is sometimes called Bloch's theorem. 18

17 See Seitz, F., "Modern Theory of Solids," McGraw-Hill Book Co., New York, 1940, Chap. VIII.

18 Bloch, F., Z. Physik 52, 555 (1928).



thermodynamics are peculiar inasmuch as they usually

$$dW = \sum_{\lambda=1}^{n} X_\lambda\,dx_\lambda \tag{2-82}$$

where the $X_\lambda$ are functions of some or all the independent variables. While (82), which is known as a Pfaff expression, is not a differential equation of the customary kind, its importance in chemistry warrants consideration. It is for lack of a more adequate place that it is inserted in the chapter on differential equations. Some of the material which will be developed from a mathematical point of view has already been used in Chapter 1, to which reference is made for further applications. The equation

$$\sum_{\lambda=1}^{n} X_\lambda\,dx_\lambda = 0 \tag{2-83}$$

is sometimes called a total differential equation or, more often, a Pfaff equation.

Clearly, the expression dW, eq. (82), can be integrated along any path in n-dimensional space, but the integral will in general depend on the path of integration. (See Prob. a, p. 87; also the ex

When $\int dW$ depends on the path of integration, dW is said to be incomplete or inexact.

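The condition that (82) be a complete differential is that

$$dW = df(x_1, x_2, \cdots, x_n)$$

for then $\int dW = f(r_2) - f(r_1)$, independently of path. Now

$$df = \sum_\lambda \frac{\partial f}{\partial x_\lambda}\,dx_\lambda$$

Comparing with (82), we find

$$X_\lambda = \frac{\partial f}{\partial x_\lambda}$$

To state this relation without explicitly introducing f, we differentiate it with respect to $x_\mu$, $\mu \neq \lambda$:

$$\frac{\partial X_\lambda}{\partial x_\mu} = \frac{\partial^2 f}{\partial x_\lambda\,\partial x_\mu}$$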

PFAFF DIFFERENTIAL EXPRESSIONS AND EQUATIONS 


2.18 


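But also

$$\frac{\partial X_\mu}{\partial x_\lambda} = \frac{\partial^2 f}{\partial x_\mu\,\partial x_\lambda}$$

Hence the necessary condition of "exactness" may be written in the form

$$\frac{\partial X_\lambda}{\partial x_\mu} = \frac{\partial X_\mu}{\partial x_\lambda},\quad \lambda, \mu = 1, 2, \cdots, n \tag{2-84}$$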


The reader who is already familiar with vector analysis will note that, if the $X_\lambda$ are interpreted as components of a vector R, (82) may be written

$$dW = \mathbf{R}\cdot d\mathbf{x} \tag{2-82'}$$

and the condition of "exactness" becomes

$$\frac{\partial X}{\partial y} = \frac{\partial Y}{\partial x},\quad \frac{\partial X}{\partial z} = \frac{\partial Z}{\partial x},\quad \frac{\partial Y}{\partial z} = \frac{\partial Z}{\partial y},\quad\text{or}\quad \nabla\times\mathbf{R} = 0 \tag{2-84'}$$

These results are of importance in vector analysis where they are usually expressed as follows: The condition that the line integral of R (expression (82')) around any closed curve shall vanish is that R be the gradient of some scalar function, and this is equivalent to condition (84'). (Cf. sec. 4.17.)

We return now to the general situation:

dW is not exact

and distinguish two cases:

A The equation dW = 0 has a solution.

B The equation dW = 0 does not have a solution.

A. The equation dW = 0 possesses a solution. Leaving aside for the moment all considerations as to when such solutions may be found, we shall first sketch the consequences of the existence of solutions. The equation dW = 0 assigns to every point a direction, or, what amounts to the same thing, an element of surface. (From the point of view of vector analysis this is immediately clear because the relation $\mathbf{R}\cdot d\mathbf{r} = 0$ specifies at every point $(x_1 \cdots x_n)$ the direction $d\mathbf{r}$ which is perpendicular to the vector R.)

When integrated, the equation dW = 0 leads to

$$\varphi(x_1, x_2, \cdots, x_n) = c \tag{2-85}$$

which represents a one-parameter family of surfaces in n-dimensional space. These surfaces consist of the elements specified by dW = 0.

Tj 17*/5 /vo ts /ww oil “f n h s\r% m a-/ n J * . _ i 


and dW = 0. The same is true along a neighboring surface $\varphi = c'$. Suppose we wish to go from A to C. The change occurring in $\varphi$ is the same no matter whether the crossing is made at $B_1$ or at $B_2$. But the change in W will depend on the path. The important point to note is that no change occurs in W as we pass along either curve; a change can occur only on crossing: dW = function of the point at which the crossing is made. If $dW \neq 0$ along the two curves, then it would depend on the whole path, not merely on the point of crossing. Hence $dW = t(B)\,d\varphi$, where B is the point of crossing. Hence $dW = t(x_1 \cdots x_n)\,d\varphi$, or

$$d\varphi = \frac{dW}{t}$$

But $d\varphi$ is an exact differential; t is therefore an integrating denominator of dW.
Along the curves $\varphi$ = const., the equation $F(\varphi)$ = const. will likewise be satisfied if F represents a unique, single-valued function. If, then, we use $F(\varphi)$ in place of $\varphi$ in the preceding analysis, we are led to

$$dF = \frac{dW}{T}$$

instead of

$$d\varphi = \frac{dW}{t}$$

Since, however, $dF = (dF/d\varphi)\,d\varphi$, we see that $T = t/(dF/d\varphi)$ is also an integrating denominator. It is clear that, if there exists one integrating denominator t for a Pfaff expression, an infinite number of others can be formed by the above rule.

Only the points on the surface $\varphi = c$ are connected with A by paths along which dW = 0. It is clear that in the neighborhood of A there is an infinite number of points not connected with A by such paths. Hence the fact, important in thermodynamics (though somewhat trivial geometrically!):

If the inexact differential dW possesses an integrating denominator, then there exist, in the neighborhood of every point P, innumerable points which cannot be reached from P along paths for which dW = 0.

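We now consider the question of how to find the integrating denominator.

1. Case of two variables. First solve the equation

$$dW = 0;\quad X\,dx + Y\,dy = 0 \tag{2-86}$$

The solution is

$$y = f(x)\quad\text{or}\quad \varphi(x, y) = c \tag{2-87}$$

Along the curves (87), $\varphi_x\,dx + \varphi_y\,dy = 0$, hence

$$\frac{dy}{dx} = -\frac{\varphi_x}{\varphi_y}$$

But from (86)

$$\frac{dy}{dx} = -\frac{X}{Y}$$

so that

$$\frac{\varphi_x}{\varphi_y} = \frac{X}{Y},\quad\text{or}\quad \frac{\varphi_x}{X} = \frac{\varphi_y}{Y} \equiv u(x, y) \tag{2-89}$$

Hence

$$d\varphi = \varphi_x\,dx + \varphi_y\,dy = uX\,dx + uY\,dy = u\,dW$$

and the integrating denominator is

$$t = \frac{1}{u} = \frac{X}{\varphi_x} = \frac{Y}{\varphi_y} \tag{2-90}$$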
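A short worked example may make the two-variable recipe concrete. The Pfaff expression chosen below, $dW = y\,dx - x\,dy$, is an illustration introduced here and does not come from the text.

```latex
% Illustrative example (not from the text): dW = y dx - x dy, i.e. X = y, Y = -x.
\begin{align*}
  dW = 0 &:\quad y\,dx - x\,dy = 0
      \;\Longrightarrow\; \frac{dy}{dx} = \frac{y}{x}
      \;\Longrightarrow\; \varphi(x,y) = \frac{y}{x} = c, \\
  \varphi_x &= -\frac{y}{x^2}, \qquad \varphi_y = \frac{1}{x}, \\
  u &= \frac{\varphi_x}{X} = -\frac{1}{x^2}
     = \frac{\varphi_y}{Y}, \\
  t &= \frac{1}{u} = -x^2, \qquad
  \frac{dW}{t} = \frac{x\,dy - y\,dx}{x^2} = d\!\left(\frac{y}{x}\right) = d\varphi .
\end{align*}
```

Both ratios in (89) give the same u, as they must, and division of dW by t produces an exact differential.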


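2. Case of three variables. First solve

$$dW = 0;\quad X\,dx + Y\,dy + Z\,dz = 0 \tag{2-91}$$

The solution is

$$\varphi(x, y, z) = c$$

Along these surfaces, $\varphi_x\,dx + \varphi_y\,dy + \varphi_z\,dz = 0$, hence

$$\left(\frac{\partial y}{\partial x}\right)_z = -\frac{\varphi_x}{\varphi_y},\quad \left(\frac{\partial z}{\partial x}\right)_y = -\frac{\varphi_x}{\varphi_z},\quad \left(\frac{\partial z}{\partial y}\right)_x = -\frac{\varphi_y}{\varphi_z}$$

But from (91)

$$\left(\frac{\partial y}{\partial x}\right)_z = -\frac{X}{Y},\quad \left(\frac{\partial z}{\partial x}\right)_y = -\frac{X}{Z},\quad \left(\frac{\partial z}{\partial y}\right)_x = -\frac{Y}{Z}$$

so that

$$\frac{\varphi_x}{\varphi_y} = \frac{X}{Y},\quad \frac{\varphi_x}{\varphi_z} = \frac{X}{Z},\quad \frac{\varphi_y}{\varphi_z} = \frac{Y}{Z}$$

or

$$\frac{\varphi_x}{X} = \frac{\varphi_y}{Y} = \frac{\varphi_z}{Z} \equiv u(x, y, z)$$

$$d\varphi = \varphi_x\,dx + \varphi_y\,dy + \varphi_z\,dz = u(X\,dx + Y\,dy + Z\,dz)$$

Therefore

$$t = \frac{1}{u} = \frac{X}{\varphi_x} = \frac{Y}{\varphi_y} = \frac{Z}{\varphi_z}$$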








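We now consider the condition that the equation

$$dW = 0$$

shall have a solution. (Condition of integrability.)

Suppose a solution of $\sum_\lambda X_\lambda\,dx_\lambda = 0$ exists in the form

$$\varphi(x_1 \cdots x_n) = c$$

Then

$$u(x_1 \cdots x_n)X_i = \frac{\partial\varphi}{\partial x_i},\quad i = 1, 2, \cdots, n \tag{2-92}$$

Let i, j, k be different indices. It follows from (92) that

$$\frac{\partial}{\partial x_j}(uX_i) = \frac{\partial^2\varphi}{\partial x_i\,\partial x_j} = \frac{\partial}{\partial x_i}(uX_j)$$

whence

$$u\left(\frac{\partial X_j}{\partial x_i} - \frac{\partial X_i}{\partial x_j}\right) = X_i\frac{\partial u}{\partial x_j} - X_j\frac{\partial u}{\partial x_i}$$

Similarly,

$$u\left(\frac{\partial X_i}{\partial x_k} - \frac{\partial X_k}{\partial x_i}\right) = X_k\frac{\partial u}{\partial x_i} - X_i\frac{\partial u}{\partial x_k}$$

$$u\left(\frac{\partial X_k}{\partial x_j} - \frac{\partial X_j}{\partial x_k}\right) = X_j\frac{\partial u}{\partial x_k} - X_k\frac{\partial u}{\partial x_j}$$

Multiply the last three equations by $X_k$, $X_j$, and $X_i$, respectively, and add:

$$X_k\left(\frac{\partial X_j}{\partial x_i} - \frac{\partial X_i}{\partial x_j}\right) + X_j\left(\frac{\partial X_i}{\partial x_k} - \frac{\partial X_k}{\partial x_i}\right) + X_i\left(\frac{\partial X_k}{\partial x_j} - \frac{\partial X_j}{\partial x_k}\right) = 0 \tag{2-93}$$

By closer analysis, this equation may be shown to be both necessary and sufficient: it represents the condition of integrability for the Pfaff equation dW = 0. In three variables, eq. (93) takes the form

$$\mathbf{R}\cdot\nabla\times\mathbf{R} = 0$$

provided R is interpreted as the vector having components $X_1$, $X_2$, $X_3$. The total number of equations of the form (93) is equal to the number of triangles that can be formed with n given points as corners; it is therefore $\tfrac16 n(n - 1)(n - 2)$. These equations are therefore not independent.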
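In three variables the condition $\mathbf{R}\cdot\nabla\times\mathbf{R} = 0$ is easily tested numerically. The following sketch uses central differences; the function names, the step size, and the two sample fields are illustrative assumptions. The first field, $(-y, x, k)$, gives $\mathbf{R}\cdot\nabla\times\mathbf{R} = 2k$, so that the corresponding Pfaff equation is not integrable; the second, $(y, -x, 0)$, satisfies the condition.

```python
def curl(field, point, h=1e-5):
    """Central-difference curl of a vector field R(x, y, z) -> (X, Y, Z)."""
    def partial(component, axis):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        return (field(*forward)[component] - field(*backward)[component]) / (2 * h)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))


def integrability_defect(field, point):
    """R . (curl R); it vanishes wherever condition (2-93) holds."""
    r = field(*point)
    c = curl(field, point)
    return sum(ri * ci for ri, ci in zip(r, c))


k = 3.0
not_integrable = lambda x, y, z: (-y, x, k)   # defect is 2k everywhere
integrable = lambda x, y, z: (y, -x, 0.0)     # defect is zero everywhere

point = (0.4, -1.2, 2.0)
defects = (integrability_defect(not_integrable, point),
           integrability_defect(integrable, point))
```

A single point suffices for these two fields because the defect is constant; for a general field the test would have to be repeated over the region of interest.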

It is to be observed that, in the case of two variables, eq. (93) is always 
satisfied. Hence every Pfaff equation of the form 


variables, where the solutions can be visualized easily in ordinary space; generalization to more variables introduces no complications. It will be seen that "improper" solutions of eq. (82) are still possible, but that they represent a greater variety of functions than the proper solutions considered in the preceding paragraphs.

We now choose an arbitrary relation

$$\psi(x, y, z) = 0 \tag{2-94}$$

and impose this upon eq. (82), thereby effectively eliminating one degree of freedom. From (94) and its differential form

$$\psi_x\,dx + \psi_y\,dy + \psi_z\,dz = 0$$

the variables z and dz are obtained in terms of x, y, dx, dy, and these solutions are substituted in eq. (82). It will then be of the form

$$X\,dx + Y\,dy = 0$$

and this has a solution

(2-95)

The improper solutions of (82) are said to be those curves which satisfy (94) and (95) simultaneously. They represent, therefore, prescribed curves upon arbitrary surfaces. Further investigation would show that every point in the neighborhood of a given point can be reached by a continuous curve satisfying (94) and (95) from the given point, the state of affairs being quite different from that described under A.


Problem a. Let $dW = x(dx + dy)$. Compute the integral $\int_{x_1y_1}^{x_2y_2} dW$ along two paths:

1. $x_1y_1 \to x_2y_1 \to x_2y_2$.

2. $x_1y_1 \to x_1y_2 \to x_2y_2$.

Show that the two results differ by the area enclosed by the two paths of integration.
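The path dependence asserted in Problem a can be illustrated numerically before it is proved. The sketch below is a check for one arbitrarily chosen pair of end points, not a solution of the problem; the function name and the sample coordinates are assumptions.

```python
def integrate_dW(path, samples=1000):
    """Line integral of dW = x (dx + dy) along straight segments joining the points in `path`."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        for i in range(samples):
            # midpoint rule on each sub-segment; exact here because x varies linearly
            x_mid = x0 + (i + 0.5) / samples * (x1 - x0)
            total += x_mid * ((x1 - x0) + (y1 - y0)) / samples
    return total


start, end = (1.0, 2.0), (4.0, 6.0)
path_1 = [start, (end[0], start[1]), end]   # first along x, then along y
path_2 = [start, (start[0], end[1]), end]   # first along y, then along x
difference = integrate_dW(path_1) - integrate_dW(path_2)
area = (end[0] - start[0]) * (end[1] - start[1])
```

For these end points the two integrals are 23.5 and 11.5, and their difference equals the area of the enclosed rectangle.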


Problem b. Show that the expression

$$dW \equiv -y\,dx + x\,dy + k\,dz = 0$$

where k is a constant, does not possess an integral. 19

19 See Born, M., Physik. Z. 22, 250 (1921).




CHAPTER 3 
SPECIAL FUNCTIONS 


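3.1. Elements of Complex Integration. Theorems of Cauchy. — Some acquaintance with the calculus of complex variables facilitates this work; hence the present section and the next will outline the elements of this useful subject.

As to notation, $i^2 = -1$, and the symbols x and y are used for single real variables. Furthermore,

$$z = x + iy = \rho e^{i\theta}$$

Through the Argand diagram, which consists of a real axis along x, an imaginary axis along y, and presents z as the point with rectangular coordinates x and y, this last relation is at once made clear: $\rho^2 = x^2 + y^2$, $\theta = \tan^{-1}(y/x)$.

Now let f(z) be a single-valued function and analytic in the sense that it has a unique derivative with respect to both x and y at every point of the Argand plane. We may then write, in terms of two new analytic functions u and v,

$$f(z) = u(x, y) + iv(x, y)$$

There follows an important result. Since

$$\frac{\partial f}{\partial x} = \frac{df}{dz}\frac{\partial z}{\partial x} = \frac{df}{dz} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}$$

$$\frac{\partial f}{\partial y} = \frac{df}{dz}\frac{\partial z}{\partial y} = i\frac{df}{dz} = \frac{\partial u}{\partial y} + i\frac{\partial v}{\partial y}$$

one finds on equating real and imaginary parts of df/dz that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y},\quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$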



These are the famous Cauchy-Riemann conditions which the components u and v of an analytic f(z) must satisfy. Further differentiation yields another important set of relations, which we shall often use hereafter:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0,\quad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0$$

Cauchy's theorem asserts that the integral

$$\oint_C f(z)\,dz = 0$$

provided it is taken along a closed curve C on and within which f(z) is analytic. A simple proof is as follows.

$$\oint_C f(z)\,dz = \oint_C [u(dx + i\,dy) + iv(dx + i\,dy)] = \oint_C (u\,dx - v\,dy) + i\oint_C (v\,dx + u\,dy)$$

By virtue of the Cauchy-Riemann conditions, both of the final integrands are exact in the sense of sec. 1.7, and the line integral around the closed contour vanishes.
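The Cauchy-Riemann conditions, $u_x = v_y$ and $u_y = -v_x$, can be tested for a given function by finite differences. The sketch below is illustrative: the function name, the step size, and the two sample functions are assumptions. An analytic function should give two vanishing residuals, whereas the complex conjugate, which is not analytic, should not.

```python
import cmath


def cauchy_riemann_residuals(f, x, y, h=1e-6):
    """Return (u_x - v_y, u_y + v_x) for f = u + iv, using central differences."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)   # u_x + i v_x
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)   # u_y + i v_y
    return fx.real - fy.imag, fy.real + fx.imag


analytic = lambda z: z ** 3 + cmath.exp(z)
not_analytic = lambda z: z.conjugate()

residuals_analytic = cauchy_riemann_residuals(analytic, 0.4, -0.7)
residuals_conjugate = cauchy_riemann_residuals(not_analytic, 0.4, -0.7)
```

For the conjugate, $u = x$ and $v = -y$, so that $u_x - v_y = 2$ and the first condition fails at every point.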

Equally important is an extension of Cauchy's theorem to which we now turn. Suppose again that f(z) is analytic within and on a closed curve C in the Argand plane, and denote by $z_0$ a fixed point within C. The function $f(z)/(z - z_0)$ will then have a singularity at $z_0$ and its line integral along C will not be zero. But the value of this integral will remain unchanged if we alter C, so long as the contour does not cross $z_0$. This follows at once from Cauchy's theorem, for the difference between the old and the new value of the integral will itself be a line integral around a region in which $f(z)/(z - z_0)$ is analytic, and will therefore vanish. Let us denote the infinitesimally small circle of radius $\rho$ surrounding the point $z_0$ by $\Gamma$. Then

$$\oint_C \frac{f(z)\,dz}{z - z_0} = \oint_\Gamma \frac{f(z)\,dz}{z - z_0} = f(z_0)\oint_\Gamma \frac{dz}{z - z_0} = f(z_0)\int_0^{2\pi}\frac{d(\rho e^{i\theta})}{\rho e^{i\theta}} = f(z_0)\int_0^{2\pi} i\,d\theta = 2\pi i f(z_0) \tag{3-1}$$
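Both Cauchy's theorem and its extension can be illustrated by numerical integration around a circle. The sketch below approximates the contour integral by an equally spaced sum over the angle, which converges very rapidly for analytic integrands; the function name, the choice $f(z) = e^z$, and the interior point 0.3 are assumptions made for illustration.

```python
import cmath
import math


def contour_integral(f, center=0j, radius=1.0, samples=2000):
    """Equally spaced approximation of the counter-clockwise integral of f around a circle."""
    total = 0j
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        z = center + radius * cmath.exp(1j * theta)
        dz = 1j * radius * cmath.exp(1j * theta) * (2.0 * math.pi / samples)
        total += f(z) * dz
    return total


z0 = 0.3
closed = contour_integral(cmath.exp)                               # should vanish
with_pole = contour_integral(lambda z: cmath.exp(z) / (z - z0))    # should be 2 pi i exp(z0)
```

The first result vanishes to rounding error, in agreement with Cauchy's theorem; the second reproduces $2\pi i f(z_0)$.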


Henceforth we shall understand that $\oint$ denotes counter-clockwise (so-called positive) integration along a closed curve C.

3.1a. Theorem of Laurent. Residues. — Let f(z) be analytic on the ring formed by two concentric circles $C_1$ and $C_2$, including these boundaries. (See Fig. 1.) Apply eq. (1) to the point $z = \zeta$, choosing a contour which goes from A along $C_2$ to B, thence inside to $C_1$, around $C_1$ in a negative sense, and finally back to A. The two horizontal portions of the path make equal and opposite contributions to the integral and therefore cancel. Hence

$$f(\zeta) = \frac{1}{2\pi i}\oint_{C_2}\frac{f(z)\,dz}{z - \zeta} + \frac{1}{2\pi i}\int_{C_1}\frac{f(z)\,dz}{z - \zeta} \tag{a}$$

the integral along $C_1$ being taken in the negative sense. In the first integral we may write ($z_0$ now denotes the common center of $C_1$ and $C_2$)

$$\frac{1}{z - \zeta} = \frac{1}{z - z_0}\cdot\frac{1}{1 - \dfrac{\zeta - z_0}{z - z_0}} = \frac{1}{z - z_0}\sum_{\lambda=0}^{\infty}\left(\frac{\zeta - z_0}{z - z_0}\right)^\lambda$$
z - z 0 

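and obtain a series which converges on $C_2$. Therefore

$$\frac{1}{2\pi i}\oint_{C_2}\frac{f(z)\,dz}{z - \zeta} = \sum_{\lambda=0}^{\infty} a_\lambda(\zeta - z_0)^\lambda \tag{b}$$

provided we define the coefficients

$$a_\lambda = \frac{1}{2\pi i}\oint_{C_2}\frac{f(z)\,dz}{(z - z_0)^{\lambda+1}}$$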

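In the second integral of (a), which is taken along $C_1$ in the negative sense, we use the convergent expansion

$$\frac{1}{z - \zeta} = -\frac{1}{\zeta - z_0}\cdot\frac{1}{1 - \dfrac{z - z_0}{\zeta - z_0}} = -\frac{1}{\zeta - z_0}\sum_{\lambda=0}^{\infty}\left(\frac{z - z_0}{\zeta - z_0}\right)^\lambda \tag{c}$$

so that

$$\frac{1}{2\pi i}\int_{C_1}\frac{f(z)\,dz}{z - \zeta} = -\sum_{\lambda=0}^{\infty} b_\lambda(\zeta - z_0)^{-\lambda-1} \tag{d}$$

provided

$$b_\lambda = \frac{1}{2\pi i}\int_{C_1} f(z)(z - z_0)^\lambda\,dz \tag{e}$$

Now put $\lambda = -\mu - 1$. Then $b_\lambda$ becomes $a_\mu$, except that the integration in $b_\lambda$ is along $C_1$, in $a_\mu$ along $C_2$. But the integrands have no singularities within the ring, hence the path of integration may be the same in both of these expressions. We may indeed take any closed curve, C, within the ring, which encloses the point $z_0$. Since, however, the sense of $C_2$ is opposite to that of $C_1$, $b_{-\mu-1} = -a_\mu$. Eq. (d) now reads

$$\frac{1}{2\pi i}\int_{C_1}\frac{f(z)\,dz}{z - \zeta} = \sum_{\mu=-\infty}^{-1} a_\mu(\zeta - z_0)^\mu \tag{f}$$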

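When we add (b) and (f) to form (a), and replace $\zeta$ by z in the result, we find

$$f(z) = \sum_{\lambda=-\infty}^{\infty} a_\lambda(z - z_0)^\lambda \tag{3-2a}$$

with

$$a_\lambda = \frac{1}{2\pi i}\oint_C \frac{f(z)\,dz}{(z - z_0)^{\lambda+1}} \tag{3-2b}$$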


Eq. (2) is called Laurent's theorem. It shows that a function which has no singularities on a circular ring can be expanded as a Laurent series which contains negative as well as positive powers of $z - z_0$.
The term $a_{-1}$, formed by means of eq. (2b), is especially important. It is

$$a_{-1} = \frac{1}{2\pi i}\oint_C f(z)\,dz \tag{3-3}$$


As to the contour C 
it includes the point 
analytic at z 0 , a__i = 0. 
is useful, therefore, wh< 
lar point of f(z). The 
called the residue of the 
z 0y and eq. (3) is know: 
of residues. 

If f(z) has a numfc 
ties Zo, zi, z 2 , etc., w: 
C, this path may be < 
manner shown in Fig. 


circles, lies in a region free from singularities, it contributes nothing to $\oint f(z)\,dz$. The remainder comes from the singularities and is equal to $2\pi i$ times the sum of their residues. Hence

$$\oint_C f(z)\,dz = 2\pi i\ (\text{sum of residues within } C).$$


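Example. To evaluate the integral

$$I = \int_0^{2\pi}\frac{d\varphi}{a + b\cos\varphi + c\sin\varphi}$$

let $z = e^{i\varphi}$; $\varphi = -i\log z$, $d\varphi = -i(dz/z)$. Then $\cos\varphi = \tfrac12(z + z^{-1})$, $\sin\varphi = \tfrac{1}{2i}(z - z^{-1})$.

$$I = -i\oint\frac{dz}{az + \dfrac{b}{2}(z^2 + 1) + \dfrac{c}{2i}(z^2 - 1)}$$

the contour being the unit circle about 0. The denominator of the integrand may be written

$$\tfrac12(b - ic)z^2 + az + \tfrac12(b + ic) = \tfrac12(b - ic)\left[z - \frac{1}{b - ic}(-a + R)\right]\left[z - \frac{1}{b - ic}(-a - R)\right]$$

provided we put

$$\sqrt{a^2 - b^2 - c^2} = R$$

If $a^2 - b^2 - c^2 > 0$ (and $a > 0$) then

$$\left|\frac{1}{b - ic}(-a + R)\right| < 1$$

while the other root has modulus > 1 and lies outside the unit circle. The residue of the integrand at $z = (-a + R)/(b - ic)$ is

$$\frac{1}{\tfrac12(b - ic)\left[\dfrac{1}{b - ic}(-a + R) - \dfrac{1}{b - ic}(-a - R)\right]} = \frac{1}{R}$$

Therefore

$$I = -i\cdot 2\pi i\,(a^2 - b^2 - c^2)^{-1/2} = \frac{2\pi}{\sqrt{a^2 - b^2 - c^2}}$$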
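The result of the example is readily confirmed by direct numerical integration. In the sketch below the function names and the sample values a = 3, b = 1, c = 2 (for which $a^2 - b^2 - c^2 = 4$ and the integral should equal $\pi$) are illustrative assumptions; the closed form presupposes $a > \sqrt{b^2 + c^2}$.

```python
import math


def angular_integral(a, b, c, samples=4000):
    """Equally spaced estimate of the integral of 1/(a + b cos(phi) + c sin(phi)) over 0..2*pi."""
    step = 2.0 * math.pi / samples
    return step * sum(1.0 / (a + b * math.cos(i * step) + c * math.sin(i * step))
                      for i in range(samples))


def residue_result(a, b, c):
    """Closed form 2*pi / sqrt(a^2 - b^2 - c^2); assumes a > sqrt(b^2 + c^2)."""
    return 2.0 * math.pi / math.sqrt(a * a - b * b - c * c)


numerical = angular_integral(3.0, 1.0, 2.0)
closed_form = residue_result(3.0, 1.0, 2.0)
```

The equally spaced sum is used because the integrand is smooth and periodic, for which this rule is exceptionally accurate.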


3.2. Gamma Function. — The gamma function is a generalization of the factorial n!. Its definition, due to Euler, states

$$\Gamma(z) = \lim_{n\to\infty}\frac{1\cdot 2\cdot 3\cdots(n-1)}{z(z+1)\cdots(z+n-1)}\,n^z \tag{3-4}$$

Several important properties of the $\Gamma$-function follow at once from this definition. Since from (4)

$$\Gamma(z+1) = \lim_{n\to\infty}\frac{1\cdot 2\cdots(n-1)}{(z+1)(z+2)\cdots(z+n)}\,n^{z+1}$$

we have

$$\Gamma(z+1) = \lim_{n\to\infty}\frac{zn}{z+n}\cdot\frac{1\cdot 2\cdots(n-1)}{z(z+1)\cdots(z+n-1)}\,n^z = z\Gamma(z) \tag{3-5}$$

On the other hand, (4) also shows that

$$\Gamma(1) = \lim_{n\to\infty}\frac{n!}{n!} = 1 \tag{3-6}$$

From (5) and (6) it is at once apparent that, if n is a positive integer,

$$\Gamma(n) = (n-1)!$$

as was stated above. It is also evident from the definition (4) that $\Gamma(z)$ becomes infinite at $z = 0, -1, -2$, etc., and that it is an analytic function everywhere else.

It is often useful to represent T(z) by means of a definite integral, 
achieve this, we consider the function 

FM “ r (' - 


wherein n stands for a positive integer, and the real part of z is taken 
greater than zero in order to insure convergence of the integral, 
transformation r = t/n converts F into 


F(z,n) = n* £ (1 - r) 


The integral appearing here may be evaluated by repeated partial 
grations: 

f ( 1 - r) n r I—1 (ir = T(1 - r) n -"| + - f\ 1 - T) n “Vdr 
J o L z Jo z J o 

The integrated part here vanishes at both limits, and the remainder 
again be subjected to a partial integration, yielding 



integrated part is again zero. By continuing this process we find 


n) 


n(n — 1) 


z{z + 1) • • • (z + n — 1) 


5 f - 

Jo 


1-2 


• re 


z(z + l)---(z+n) 


n 


i approaches infinity, this expressio^becomes identical with (4); hence 

lim F(z,n) = T(z) (3-9) 


On the other hand, since e = lim_{p→∞} (1 + 1/p)^p and therefore

    e^x = \lim_{p\to\infty} (1 + 1/p)^{px} = \lim_{n\to\infty} (1 + x/n)^n

the quantity (1 − t/n)^n appearing in (8) approaches the limit e^{−t}. We
conclude, therefore, that in view of (8) and (9)

    \int_0^\infty e^{-t} t^{z-1}\, dt = \Gamma(z)        (3-10)


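This result is valid, we recall, when the real part of z is greater than zero.
A definition of the Γ-function, or rather its reciprocal, by means of an
infinite product has been given by Weierstrass. Since it is a useful one, we
shall here derive it by simple steps (the rigor of which is not always obvious)
from Euler's definition (4). We first note that the product

    \frac{1}{z} \cdot \frac{1}{z+1} \cdot \frac{2}{z+2} \cdots \frac{n-1}{z+n-1}

which appears in (4), may be written (1/z) ∏_{m=1}^{n−1} (1 + z/m)^{−1}, so that (4)
becomes

    \Gamma(z) = \frac{1}{z} \lim_{n\to\infty} n^z \prod_{m=1}^{n-1}\left(1 + \frac{z}{m}\right)^{-1}

or

    \frac{1}{\Gamma(z)} = z \lim_{n\to\infty} n^{-z} \prod_{m=1}^{n-1}\left(1 + \frac{z}{m}\right)

If we multiply the right-hand side of this equation by unity in the form of

    \left[\lim_{n\to\infty} e^{z(1 + 1/2 + \cdots + 1/(n-1))}\right]\left[\lim_{n\to\infty} \prod_{m=1}^{n-1} e^{-z/m}\right]

we obtain

    \frac{1}{\Gamma(z)} = z \lim_{n\to\infty} e^{z(1 + 1/2 + \cdots + 1/(n-1) - \log n)} \prod_{m=1}^{n-1}\left(1 + \frac{z}{m}\right)e^{-z/m}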





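Now the infinite series

    \lim_{n\to\infty} \left(1 + \tfrac{1}{2} + \cdots + \tfrac{1}{n} - \log n\right) = C

converges; it has the value C = 0.5772···, known as the Euler-Mascheroni
constant. Hence

    \frac{1}{\Gamma(z)} = z e^{Cz} \prod_{n=1}^{\infty}\left(1 + \frac{z}{n}\right)e^{-z/n}        (3-11)

which is the Weierstrass definition. It shows, again, that Γ(z) becomes
infinite at z = 0, −1, −2, etc.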

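A further important property of Γ-functions, namely the relation

    \Gamma(z)\Gamma(1-z) = \frac{\pi}{\sin \pi z}        (3-12)

is readily derived from the Weierstrass definition. First, we recall the
theorem:

    \sin \pi z = \pi z \prod_{n=1}^{\infty}\left(1 - \frac{z^2}{n^2}\right)        (3-13)

which may be proved by an expansion of the infinite product in
powers of z². (The details are left as an exercise for the reader.)
Then, from (11),

    \Gamma(z)\Gamma(-z) = -\frac{1}{z^2}\prod_{n=1}^{\infty}\left(1 + \frac{z}{n}\right)^{-1}\left(1 - \frac{z}{n}\right)^{-1} = -\frac{\pi}{z \sin \pi z}        (3-14)

the last step because of (13). But in view of (5)

    \Gamma(-z) = -\frac{1}{z}\,\Gamma(1-z)

and this, when inserted in (14), yields (12).

Several other formulas, for the derivation of which the reader must
refer to mathematical treatises,¹ will now be listed without proof.

    \Gamma(z)\Gamma(z + \tfrac{1}{2}) = 2^{1-2z}\pi^{1/2}\,\Gamma(2z)        (3-15)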

An infinite product of the form

    \frac{1-a}{1-b} \cdot \frac{2-a}{2-b} \cdot \frac{3-a}{3-b} \cdots

may be expressed in terms of Γ-functions:

rfi-n 




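Also,

    \prod_{n=1}^{\infty} \frac{n(a+b+n)}{(a+n)(b+n)} = \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+1)}        (3-16a)

If m and n are positive constants, not necessarily integral, we have

    \int_0^{\pi/2} \cos^{m-1}x\, \sin^{n-1}x\, dx = \frac{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)}{2\,\Gamma\left(\frac{m+n}{2}\right)}        (3-17)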


This relation may be modified as follows. Put m = 2r, n = 2s, and
introduce the new variable of integration cos² x = u on the left. The
integral will then be converted into

    \frac{1}{2}\int_0^1 u^{r-1}(1-u)^{s-1}\, du

The integral appearing here is a function of r and s known as the Eulerian
integral of the first kind, or simply the B-function, and denoted by B(r,s).
Eq. (17) may therefore be put in the form

    B(r,s) = \frac{\Gamma(r)\Gamma(s)}{\Gamma(r+s)}        (3-17')
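
Relations (17) and (17') lend themselves to a quick numerical check. The sketch below is only an illustration: the function names `trig_integral` and `beta` are hypothetical, and the midpoint rule is an assumed choice of quadrature that works well here for integral m, n ≥ 1, where the integrand is a smooth trigonometric polynomial.

```python
import math

def trig_integral(m, n, steps=2000):
    """Midpoint-rule estimate of the left-hand side of (3-17)."""
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += math.cos(x) ** (m - 1) * math.sin(x) ** (n - 1)
    return total * h

def beta(r, s):
    """B(r, s) evaluated from the Gamma-function form (3-17')."""
    return math.gamma(r) * math.gamma(s) / math.gamma(r + s)

if __name__ == "__main__":
    m, n = 5, 3
    # By (3-17) with m = 2r, n = 2s the integral equals B(r, s) / 2.
    print(trig_integral(m, n), beta(m / 2, n / 2) / 2)
```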


The logarithmic derivative of the Γ-function is given by

    \frac{d}{dz}\ln \Gamma(z) = \int_0^\infty \left[\frac{e^{-t}}{t} - \frac{e^{-zt}}{1 - e^{-t}}\right] dt        (3-18)

if x = real part of z > 0, as was shown by Gauss.

From this result it is possible to obtain an expression for ln Γ(z) which
is useful in evaluating Γ(z) for large values of z:

    \ln \Gamma(z) = (z - \tfrac{1}{2})\ln z - z + \tfrac{1}{2}\ln(2\pi) + O(1/x)        (3-19)

where O(1/x) represents a series of terms which vanish for large z at least as
strongly as 1/x. For real z, (19) takes the form of Stirling's series, when
written for Γ instead of its logarithm:

    \Gamma(x) = e^{-x} x^{x-1/2} (2\pi)^{1/2}\left[1 + \frac{1}{12x} + \frac{1}{288x^2} - \frac{139}{51840x^3} - \frac{571}{2488320x^4} + \cdots\right]        (3-20)

It is valid when x is large. This expansion may be used for the approximate
evaluation of factorials of large numbers.
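
To see how quickly Stirling's series (20) becomes accurate, one may compare it with a library value. The following sketch assumes only the terms through 1/x⁴ written out in (20); `gamma_stirling` is an illustrative name, not a library function.

```python
import math

def gamma_stirling(x):
    """Stirling's series (3-20), truncated after the 1/x**4 term."""
    leading = math.exp(-x) * x ** (x - 0.5) * math.sqrt(2 * math.pi)
    correction = (1
                  + 1 / (12 * x)
                  + 1 / (288 * x ** 2)
                  - 139 / (51840 * x ** 3)
                  - 571 / (2488320 * x ** 4))
    return leading * correction

if __name__ == "__main__":
    for x in (2.0, 5.0, 10.0):
        print(x, gamma_stirling(x), math.gamma(x))
```

Because the series is asymptotic rather than convergent, adding further terms at fixed x helps only up to a point; increasing x is what improves the agreement.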


In concluding, let us compute a few numerical values of the Γ-function.
It has already been noted that

    Γ(0) = ∞, Γ(1) = 1, Γ(2) = 1, Γ(3) = 2!, Γ(4) = 3!, etc.

If the values of Γ(x) in the interval 0 < x < 1 are known, Γ(x) can be
computed for all real positive x by means of (5). Γ(½) is easily obtained
from eq. (10):

    \Gamma(\tfrac{1}{2}) = \int_0^\infty e^{-t} t^{-1/2}\, dt = 2\int_0^\infty e^{-x^2}\, dx

if x² is written for t. Hence Γ(½) = √π. The same result could have
been obtained by putting z = ½ in (12). Thus we find: Γ(3/2) = ½√π,
Γ(5/2) = ¾√π, Γ(−½) = −2√π, etc.² The qualitative behavior
of Γ(x) is plotted in Fig. 3.

Problems. 

a. Prove eq. (13) by expanding the infinite product. 

b. Prove eq. (17') directly. Hint: Express Γ(r)Γ(s) as a double integral in
accordance with eq. (10). Next, put the two variables of integration equal,
respectively, to x² and y² and then transform to polar coordinates. The radial
integral will give Γ(r + s), the remainder B(r,s).

c. Show that B(r,s) = B(r + 1,s) + B(r,s + 1).

3.3. Legendre Polynomials. — Of the solutions of Legendre's equation
(2.28), the functions denoted by P_l(x) in sec. 2.11 are of greatest interest
because they remain finite at x = ±1. In physical problems, the
argument of P_l is usually the cosine of an angle and has therefore the range
−1 ≤ x ≤ 1. P_l is definite at the endpoints of that range; the other
solutions are not. Hence the present discussion will be restricted to the
polynomials P_l. We repeat their definition:

    P_l(x) = \frac{(2l)!}{2^l (l!)^2}\left\{x^l - \frac{l(l-1)}{2(2l-1)}x^{l-2} + \frac{l(l-1)(l-2)(l-3)}{2 \cdot 4(2l-1)(2l-3)}x^{l-4} - \cdots\right\}        (3-22)

Specifically,

    P_0 = 1,  P_1 = x,  P_2 = ½(3x² − 1),  P_3 = ½(5x³ − 3x),  P_4 = ⅛(35x⁴ − 30x² + 3)

An interesting representation of P_l is easily established. When the
function

    F(x,y) = (1 - 2xy + y^2)^{-1/2}        (3-23)

is differentiated n times with respect to y and y is then put equal to zero,
the result is n! P_n(x).






Fig. 3—3 

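Hence, when F(x,y) is expanded in a MacLaurin series about the value F(x,0), the result

    F(x,y) = F(x,0) + y\left.\frac{\partial F(x,y)}{\partial y}\right|_{y=0} + \frac{y^2}{2!}\left.\frac{\partial^2 F(x,y)}{\partial y^2}\right|_{y=0} + \cdots + \frac{y^l}{l!}\left.\frac{\partial^l F(x,y)}{\partial y^l}\right|_{y=0} + \cdots

takes the form

    F(x,y) = (1 - 2xy + y^2)^{-1/2} = \sum_{l=0}^{\infty} P_l(x)\, y^l        (3-24)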


This relation has meaning only when the right-hand side converges.
Suppose that |x| ≤ 1 which, as pointed out, is the case in most applications.
P_l(x) will then also lie between 1 and −1, for the definition
shows that every

    P_l(1) = 1

and that |P_l(x)| < 1 for |x| < 1. Thus the coefficients of y^l in (24)
are never greater, in absolute value, than 1, and the series converges
when |y| < 1.
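
The generating relation (24) can be checked numerically by summing the series for some |y| < 1. In the sketch below, `legendre` evaluates P_l from the explicit sum that is equivalent to the definition (22); the names `legendre` and `generating_sum`, and the cutoff of 40 terms, are illustrative assumptions.

```python
import math

def legendre(l, x):
    """P_l(x) from the finite power series equivalent to definition (3-22)."""
    total = 0.0
    for lam in range(l // 2 + 1):
        coeff = ((-1) ** lam * math.factorial(2 * l - 2 * lam)
                 / (2 ** l * math.factorial(lam)
                    * math.factorial(l - lam) * math.factorial(l - 2 * lam)))
        total += coeff * x ** (l - 2 * lam)
    return total

def generating_sum(x, y, terms=40):
    """Partial sum of the right-hand side of (3-24)."""
    return sum(legendre(l, x) * y ** l for l in range(terms))

if __name__ == "__main__":
    x, y = 0.3, 0.4
    print(generating_sum(x, y), (1 - 2 * x * y + y * y) ** -0.5)
```

The power-series form suffers from cancellation at high l; it is adequate here because the factor y^l suppresses the high-order terms.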

Theorem (24) is of interest in the calculation of the potential due to a
static distribution of electrical charges. In terms of Fig. 4, which depicts a
distribution of charges q_1 ··· q_4 of different magnitudes and possibly of
different signs, the potential at P is

    V = \sum_i \frac{q_i}{(R^2 - 2Rr_i\cos\theta_i + r_i^2)^{1/2}}

Fig. 3-4

On identifying cos θ_i with x and (r_i/R) with y in (24) one obtains

    V = \sum_i \sum_{l=0}^{\infty} \frac{q_i r_i^l P_l(\cos\theta_i)}{R^{l+1}}

With the use of the definition

    Q_l = \sum_i q_i r_i^l P_l(\cos\theta_i)

this result becomes the multipole expansion of the potential arising from an
electric charge distribution:

    V = \sum_{l=0}^{\infty} \frac{Q_l}{R^{l+1}}

The monopole strength Q_0 is Σ_i q_i; the dipole strength Q_1 is Σ_i q_i r_i cos θ_i
and it represents the component of what is called the dipole moment
of the charge distribution, in the direction toward P. The quadrupole
strength, Q_2 = Σ_i q_i r_i² P_2(cos θ_i), is a scalar quantity constructible from
the components of a tensor called the quadrupole moment,³ and so on.

Q_l is called the strength of the 2^l-pole of the charge distribution. Its
value depends on the choice of origin. If all charges have the same sign,
Q_1 can be made to vanish by a suitable choice of origin. Furthermore,
Q_2 can be given an especially simple form by choice of origin and axes, etc.
Similar remarks are true about multipole moments.
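
The multipole expansion can be compared with the direct sum of Coulomb terms for a small set of point charges. The sketch below is an illustration under stated assumptions: charges are given as (q, (x, y, z)) pairs in arbitrary units, all helper names are hypothetical, and P_l is generated by the standard three-term recurrence (eq. 3-39 of sec. 3.5).

```python
import math

def legendre(l, x):
    """P_l(x) by the recurrence (l+1) P_{l+1} = (2l+1) x P_l - l P_{l-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def potential_direct(charges, point):
    """Sum of q_i / |R - r_i| over all charges."""
    return sum(q / math.dist(pos, point) for q, pos in charges)

def potential_multipole(charges, point, lmax=30):
    """Sum of Q_l / R**(l+1), valid when every r_i < R."""
    R = math.hypot(*point)
    total = 0.0
    for l in range(lmax + 1):
        Q_l = 0.0
        for q, pos in charges:
            r = math.hypot(*pos)
            # cos(theta_i): angle between r_i and the direction toward P
            cos_t = sum(a * b for a, b in zip(pos, point)) / (r * R) if r > 0 else 1.0
            Q_l += q * r ** l * legendre(l, cos_t)
        total += Q_l / R ** (l + 1)
    return total

if __name__ == "__main__":
    charges = [(1.0, (0.3, 0.1, -0.2)), (-2.0, (-0.4, 0.5, 0.1)), (0.5, (0.0, 0.0, 0.0))]
    P = (1.0, 2.0, 2.0)
    print(potential_direct(charges, P), potential_multipole(charges, P))
```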

The reader might find it interesting to verify the following statements.

(1) Two equal charges of opposite sign produce a dipole moment which
is independent of the choice of origin. Their quadrupole moment can be
made to vanish by taking the origin midway between them, in which case
all Q with even subscripts vanish.

(2) Four equal charges disposed with alternating signs about the
corners of a parallelogram produce a zero dipole moment and hence a
vanishing Q_1; the quadrupole moment for a given orientation of axes is
finite and independent of the choice of origin. Q_2 depends on the angle
of orientation.

(3) A continuous spherical distribution of charge has a finite quadrupole-
moment tensor, but vanishing Q_2.


The entire analysis leading to eq. (25) presupposes, of course, that every
charge is closer to the origin than the point P, since the requirement
y = (r_i/R) < 1 must be obeyed.

From the foregoing results one can derive a useful expression for P_l. Let r
be a vector extending from the origin, Δz an increment in the z-direction, and
R = r + Δz. (Cf. Fig. 5.) If then we express

    \frac{1}{R} = [r^2 + (\Delta z)^2 + 2r\Delta z\cos\theta]^{-1/2} = \frac{1}{r}\left[1 + \frac{2\Delta z}{r}\cos\theta + \left(\frac{\Delta z}{r}\right)^2\right]^{-1/2}


³ If we label the Cartesian components of r_i by r_i^1 = x_i, r_i^2 = y_i, r_i^3 = z_i, the tensor
in question is T_lm = … In the physical literature the terminology is sometimes
confused, the multipole strength being identified with the multipole moment. It is





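by means of (24), putting −cos θ = x and Δz/r = y, we have

    \frac{1}{R} = \frac{1}{r}\sum_{l=0}^{\infty} P_l(-\cos\theta)\left(\frac{\Delta z}{r}\right)^l        (3-27)

On the other hand, if 1/R = 1/|r + Δz| is expanded in a Taylor series
about R = r the result is

    \frac{1}{R} = \frac{1}{r} + \Delta z\,\frac{\partial}{\partial z}\left(\frac{1}{r}\right) + \cdots + \frac{(\Delta z)^l}{l!}\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right) + \cdots        (3-28)

On comparing the coefficients of (Δz)^l in (27) and (28) it is seen that

    \frac{1}{r^{l+1}}P_l(-\cos\theta) = \frac{1}{l!}\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right)

Since P_l(−x) = (−1)^l P_l(x), this is equivalent to

    P_l(\cos\theta) = \frac{(-1)^l r^{l+1}}{l!}\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right)        (3-29)

In using this relation it is understood that cos θ = z/r, and
r² = x² + y² + z².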

Another expression for P_l involving an l-th derivative, and in some sense
simpler than (29), is known as Rodrigues' formula. To obtain it we observe,
first of all, that

    (x^2 - 1)^l = \sum_{\lambda=0}^{l} (-1)^\lambda \frac{l!}{\lambda!(l-\lambda)!}\, x^{2l-2\lambda}

in accordance with the binomial theorem. When this expression is differentiated
l times, there results:

    \frac{d^l}{dx^l}(x^2 - 1)^l = \sum_{\lambda} (-1)^\lambda \frac{l!}{\lambda!(l-\lambda)!}\,\frac{(2l-2\lambda)!}{(l-2\lambda)!}\, x^{l-2\lambda}

the summation extending over all integers λ including 0 until λ equals
either l/2 or ½(l − 1). The right-hand side of this equation may be written

    \frac{(2l)!}{l!}\left\{x^l - \frac{l(l-1)}{2(2l-1)}x^{l-2} + \cdots\right\}

Hence, from the definition of P_l, eq. (22),

    P_l(x) = \frac{1}{2^l\, l!}\frac{d^l}{dx^l}(x^2 - 1)^l        (3-30)




If Cauchy's integral formula, f(x) = (1/2πi) ∮ f(z) dz/(z − x), is differentiated
l times with respect to the argument of f(x), the result may be written

    \frac{d^l f(x)}{dx^l} = \frac{l!}{2\pi i}\oint \frac{f(z)}{(z-x)^{l+1}}\, dz

Now choose f(x) to be (x² − 1)^l, so that

    \frac{d^l}{dx^l}(x^2 - 1)^l = \frac{l!}{2\pi i}\oint \frac{(z^2-1)^l}{(z-x)^{l+1}}\, dz

On comparing this with (30), it is seen that

    P_l(x) = \frac{2^{-l}}{2\pi i}\oint \frac{(z^2-1)^l}{(z-x)^{l+1}}\, dz        (3-31)

The path of integration here is understood to be some contour enclosing
the point x in a counter-clockwise sense. From this result, which is known
as Schlaefli's formula, it is possible to derive the formula:

    P_l(x) = \frac{1}{\pi}\int_0^\pi [x + \sqrt{x^2-1}\,\cos\varphi]^l\, d\varphi        (3-32)

To do this, one may take the contour to be a circle of radius √|x² − 1|,
so that z in (31) is x + √(x² − 1) e^{iφ} and φ varies from −π to +π. The
integral then becomes

    P_l(x) = \frac{2^{-l}}{2\pi i}\int_{-\pi}^{\pi} \frac{[x^2 - 1 + 2x\sqrt{x^2-1}\,e^{i\varphi} + (x^2-1)e^{2i\varphi}]^l}{[\sqrt{x^2-1}\,e^{i\varphi}]^{l+1}}\; i\sqrt{x^2-1}\,e^{i\varphi}\, d\varphi

           = \frac{1}{2\pi}\int_{-\pi}^{\pi} [x + \sqrt{x^2-1}\,\cos\varphi]^l\, d\varphi

This result is equivalent to (32) because the integral from −π to zero is
equal to that from zero to π.⁴


⁴ ∫_{−a}^{0} I(x)dx = ∫_{0}^{a} I(x)dx if the integrand is an
even function of x, that is, if I(−x) = I(x). To see this, change the variable of
integration to −x, and make a corresponding change in the limits:

    \int_{-a}^{0} I(x)\,dx = -\int_{a}^{0} I(-x)\,dx = -\int_{a}^{0} I(x)\,dx = \int_{0}^{a} I(x)\,dx

If I(x) is odd, that is if I(−x) = −I(x), the same substitution gives
∫_{−a}^{0} I(x)dx = −∫_{0}^{a} I(x)dx.


3.4. Integral Properties of Legendre Polynomials. — Integrals over
products of Legendre polynomials, which are needed in many quantum
mechanical problems, are best obtained with the use of Rodrigues' formula.
We wish to calculate

    \int_{-1}^{1} P_l(x)P_{l'}(x)\, dx

First, we suppose that l' > l. Substituting in accordance with eq. (30)
this integral becomes

    \frac{1}{2^{l+l'}\, l!\, l'!}\int_{-1}^{1} \frac{d^l}{dx^l}(x^2-1)^l \cdot \frac{d^{l'}}{dx^{l'}}(x^2-1)^{l'}\, dx        (3-33)

After l' successive partial integrations, in which all the integrated parts
vanish because every derivative of (x² − 1)^{l'} of order lower than l' is zero
at x = ±1, the remaining integral reads

    \frac{(-1)^{l'}}{2^{l+l'}\, l!\, l'!}\int_{-1}^{1} \frac{d^{l+l'}}{dx^{l+l'}}(x^2-1)^l \cdot (x^2-1)^{l'}\, dx        (3-34)

But the (l + l')-th derivative of (x² − 1)^l is certainly zero because the
highest power of x in (x² − 1)^l is x^{2l}, and l + l' is, by hypothesis, greater
than 2l. Therefore the integral vanishes. This is clearly true whenever
l' ≠ l; for if l should be greater than l' we need only "unpeel" the derivatives
appearing in (33) in the reverse manner by partial integrations, and
we are left with an expression like (34) but with l and l' interchanged.
Next, suppose that l = l'. The integral in (34) then reads

    \int_{-1}^{1} (x^2-1)^l \frac{d^{2l}}{dx^{2l}}(x^2-1)^l\, dx = (2l)!\int_{-1}^{1} (x^2-1)^l\, dx

the latter because the only term of (x² − 1)^l which will not vanish after 2l
differentiations is x^{2l}. But on putting x = cos θ it is seen that

    \int_{-1}^{1} (x^2-1)^l\, dx = (-1)^l\int_0^\pi \sin^{2l+1}\theta\, d\theta = (-1)^l \cdot 2 \cdot \frac{2 \cdot 4 \cdots 2l}{3 \cdot 5 \cdots (2l+1)} = \frac{(-1)^l\, 2^{2l+1}(l!)^2}{(2l+1)!}

Collecting coefficients, we find in place of (34)

    \frac{(-1)^l (2l)!}{2^{2l}(l!)^2} \cdot (-1)^l \cdot 2 \cdot \frac{2 \cdot 4 \cdots 2l}{3 \cdot 5 \cdots (2l+1)} = \frac{2}{2l+1}

Our results may be combined in the formula

    \int_{-1}^{1} P_l(x)P_{l'}(x)\, dx = \frac{2}{2l+1}\,\delta_{ll'}        (3-35)

The symbol δ_{ll'} here employed is freely used in mathematics and physics;
it is called the "Kronecker δ" and represents a discontinuous factor which
is taken to be unity when the two subscripts have the same value (l = l')
but is zero when they are not equal.
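
Because the integrands are polynomials, (35) can be verified exactly in rational arithmetic. The sketch below builds the coefficients of P_l from the explicit sum equivalent to (22) and integrates term by term; the function names are illustrative, not taken from the text.

```python
from fractions import Fraction
from math import factorial

def legendre_coeffs(l):
    """Exact coefficients c[k] of x**k in P_l(x)."""
    c = [Fraction(0)] * (l + 1)
    for lam in range(l // 2 + 1):
        c[l - 2 * lam] = Fraction(
            (-1) ** lam * factorial(2 * l - 2 * lam),
            2 ** l * factorial(lam) * factorial(l - lam) * factorial(l - 2 * lam))
    return c

def integrate_product(a, b):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if (i + j) % 2 == 0:      # odd powers integrate to zero
                total += ai * bj * Fraction(2, i + j + 1)
    return total

if __name__ == "__main__":
    for l in range(4):
        print([integrate_product(legendre_coeffs(l), legendre_coeffs(lp)) for lp in range(4)])
```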

3.5. Recurrence Relations between Legendre Polynomials. — Relations
between Legendre polynomials are most simply derived from Schlaefli's
representation of P_l(x), eq. (31). We first observe that

    \oint \frac{(z^2-1)^l}{(z-x)^l}\, dz = \oint \frac{z(z^2-1)^l}{(z-x)^{l+1}}\, dz - x\oint \frac{(z^2-1)^l}{(z-x)^{l+1}}\, dz        (3-36)

The first term on the right of this equation, however, may be transformed as
follows. Since

    \frac{d}{dz}\left[\frac{(z^2-1)^{l+1}}{(z-x)^{l+1}}\right] = (l+1)\left[\frac{2z(z^2-1)^l}{(z-x)^{l+1}} - \frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\right]

and since the integral of the left-hand side around a closed contour must
vanish, we find

    \oint \frac{z(z^2-1)^l}{(z-x)^{l+1}}\, dz = \frac{1}{2}\oint \frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\, dz

Equation (36) thus reads

    \oint \frac{(z^2-1)^l}{(z-x)^l}\, dz = \frac{1}{2}\oint \frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\, dz - x\oint \frac{(z^2-1)^l}{(z-x)^{l+1}}\, dz

Reference to equation (31) allows the two terms on the right to be identified,
after multiplication by (2^{l+1}πi)^{−1}, with P_{l+1}(x) − xP_l(x). Hence

    P_{l+1}(x) - xP_l(x) = \frac{2^{-l}}{2\pi i}\oint \frac{(z^2-1)^l}{(z-x)^l}\, dz        (3-37)

When this is differentiated with respect to x, there results

    P'_{l+1}(x) - xP'_l(x) = (l+1)P_l(x)        (3-38)

which is the first important relation to be derived. It connects Legendre
functions, and their derivatives, of degrees l and l + 1.

A relation connecting Legendre polynomials of three different degrees
may be deduced in a similar way. Clearly,

    \oint \frac{d}{dz}\left[\frac{z(z^2-1)^l}{(z-x)^l}\right] dz = 0

On carrying out the differentiation we put z² = (z² − 1) + 1 and
z = (z − x) + x, so that

    (l+1)\oint \frac{(z^2-1)^l}{(z-x)^l}\, dz + 2l\oint \frac{(z^2-1)^{l-1}}{(z-x)^l}\, dz - lx\oint \frac{(z^2-1)^l}{(z-x)^{l+1}}\, dz = 0

Now the first term appearing here may be identified by means of (37), the
others by (31). Thus, after simple rearrangement,

    (l+1)P_{l+1}(x) - (2l+1)xP_l(x) + lP_{l-1}(x) = 0        (3-39)

The remaining relations are derived by differentiation and elimination
among formulas (38) and (39). Thus, when (39) is differentiated with
respect to x and P'_{l+1} is eliminated by means of (38),

    xP'_l(x) - P'_{l-1}(x) = lP_l(x)        (3-40)

Finally, the reader will have no difficulty in proving, by eliminations among
(38), (39), and (40), that

    P'_{l+1}(x) - P'_{l-1}(x) = (2l+1)P_l(x)        (3-41)

and

    (x^2 - 1)P'_l(x) = lxP_l(x) - lP_{l-1}(x)        (3-42)

It may be remarked that the recurrence relations here derived are also
correct for Legendre functions having non-integral indices l, although the
above proof does not indicate this fact explicitly.
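
In numerical work (39) is the usual way to generate P_l, and (42) then supplies the derivative without differentiation. The following sketch, with illustrative function names and the assumption |x| ≠ 1 for the derivative, generates both and lets one confirm (38) and (41) at a sample point.

```python
def legendre_upto(lmax, x):
    """[P_0(x), ..., P_lmax(x)] from the three-term recurrence (3-39)."""
    p = [1.0, x]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[:lmax + 1]

def legendre_derivs(lmax, x):
    """[P_0'(x), ..., P_lmax'(x)] from relation (3-42); requires x*x != 1."""
    p = legendre_upto(lmax, x)
    d = [0.0]
    for l in range(1, lmax + 1):
        d.append(l * (x * p[l] - p[l - 1]) / (x * x - 1))
    return d

if __name__ == "__main__":
    x = 0.3
    p, d = legendre_upto(6, x), legendre_derivs(6, x)
    for l in range(1, 6):
        # Both differences should vanish up to rounding: eqs. (3-38) and (3-41).
        print(l, d[l + 1] - x * d[l] - (l + 1) * p[l], d[l + 1] - d[l - 1] - (2 * l + 1) * p[l])
```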

3.6. Associated Legendre Polynomials. — The associated Legendre
polynomial has been defined in sec. 2.11 as

    P_l^m(x) = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x)        (3-43)

This definition is meaningless except when m is an integer not smaller than
zero. In the present discussion, this will always be understood to be the
case.

Recurrence Relations. We first derive the more important recurrence
relations between these functions, which, as was shown in 2.11, satisfy the
differential equation

    (1 - x^2)\frac{d^2 P_l^m}{dx^2} - 2x\frac{dP_l^m}{dx} + \left[l(l+1) - \frac{m^2}{1-x^2}\right]P_l^m = 0        (3-44)

The function P_l^{(m)} = (d^m/dx^m)P_l was seen to be a solution of

    (1 - x^2)\frac{d^2}{dx^2}P_l^{(m)} - 2(m+1)x\frac{d}{dx}P_l^{(m)} + [l(l+1) - m(m+1)]P_l^{(m)} = 0

When this equation is multiplied by (1 − x²)^{m/2} it takes the form

    P_l^{m+2} - 2(m+1)\frac{x}{\sqrt{1-x^2}}P_l^{m+1} + [l(l+1) - m(m+1)]P_l^m = 0

or, on replacing m by m − 1,

    P_l^{m+1} - 2m\frac{x}{\sqrt{1-x^2}}P_l^m + [l(l+1) - m(m-1)]P_l^{m-1} = 0        (3-45)

This represents the fundamental relation between three associated Legendre
functions with equal l but consecutive values of m.

To get a similar relation for equal m but consecutive l we return to
eqs. (39) and (41). Differentiating the first of these m times we have

    (l+1)P_{l+1}^{(m)} - (2l+1)xP_l^{(m)} - (2l+1)mP_l^{(m-1)} + lP_{l-1}^{(m)} = 0        (3-46)

When (41) is differentiated m − 1 times, the result is

    P_{l+1}^{(m)} - P_{l-1}^{(m)} = (2l+1)P_l^{(m-1)}        (3-47)

On eliminating P_l^{(m−1)} between (46) and (47) we find

    (2l+1)xP_l^{(m)} = (l+m)P_{l-1}^{(m)} + (l-m+1)P_{l+1}^{(m)}

When this is multiplied by (1 − x²)^{m/2}, there results the desired relation:

    xP_l^m = (2l+1)^{-1}[(l+m)P_{l-1}^m + (l-m+1)P_{l+1}^m]        (3-48)

Two "mixed" relations, in which both l and m have different values,
are often useful (cf. Chapter 11) and will now be derived. One is at once
obtained from (47) when that equation is written with m replaced by m + 1
and then multiplied by (1 − x²)^{(m+1)/2}:

    (1 - x^2)^{1/2}P_l^m = (2l+1)^{-1}[P_{l+1}^{m+1} - P_{l-1}^{m+1}]        (3-49)

The other can be deduced from eq. (45). When xP_l^m in (45) is eliminated
by means of (48) it reads:

    (1 - x^2)^{1/2}P_l^{m+1} = \frac{2m}{2l+1}[(l+m)P_{l-1}^m + (l-m+1)P_{l+1}^m] - [l(l+1) - m(m-1)](1 - x^2)^{1/2}P_l^{m-1}

Here (1 − x²)^{1/2} P_l^{m−1} can be expressed in terms of P_{l+1}^m and P_{l−1}^m by means of (49)
(written for m − 1 instead of m). When this is done and the terms are
collected, we find

    (1 - x^2)^{1/2}P_l^{m+1} = (2l+1)^{-1}[(l+m)(l+m+1)P_{l-1}^m - (l-m)(l-m+1)P_{l+1}^m]        (3-50a)

For convenience in later work this may be written in a form similar to (49)
if only m is replaced by m − 1. Thus

    (1 - x^2)^{1/2}P_l^m = (2l+1)^{-1}[(l+m)(l+m-1)P_{l-1}^{m-1} - (l-m+1)(l-m+2)P_{l+1}^{m-1}]        (3-50b)

It is seen from (49) and (50) that √(1 − x²) P_l^m can be expressed in terms
of P_{l−1}^{m+1} and P_{l+1}^{m+1}, as well as P_{l−1}^{m−1} and P_{l+1}^{m−1}. Relations (48), (49), and
(50) are used in calculating quantum mechanical matrix elements for
central field problems (cf. sec. 11.13).

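Integral Properties. It is desired to evaluate the integral over the
product of two associated Legendre functions having the same index m.
If we use the definition (43) together with Rodrigues' formula (30) we have

    \int_{-1}^{1} P_l^m(x)P_{l'}^m(x)\, dx = \frac{(-1)^m}{2^{l+l'}\, l!\, l'!}\int_{-1}^{1} X^m \frac{d^{l+m}}{dx^{l+m}}X^l \cdot \frac{d^{l'+m}}{dx^{l'+m}}X^{l'}\, dx        (3-51)

where X = x² − 1. As was done in connection with the Legendre polynomials,
we again carry out a sequence of partial integrations, l' + m in
number, in which all the integrated parts are zero. The integral in (51)
then reads

    (-1)^{l'+m}\int_{-1}^{1} X^{l'}\frac{d^{l'+m}}{dx^{l'+m}}\left[X^m\frac{d^{l+m}}{dx^{l+m}}X^l\right] dx
        = (-1)^{l'+m}\int_{-1}^{1} X^{l'}\sum_{\lambda}\binom{l'+m}{\lambda}\frac{d^{l'+m-\lambda}}{dx^{l'+m-\lambda}}(X^m)\,\frac{d^{l+m+\lambda}}{dx^{l+m+\lambda}}(X^l)\, dx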


Now the term of highest power in X^m is x^{2m}, that in X^l is x^{2l}. Therefore
every term in the summation over λ will be zero unless, simultaneously,

    l' + m - \lambda \le 2m \quad\text{and}\quad l + m + \lambda \le 2l        (3-52)

The first of these implies: λ ≥ l' − m, the second λ ≤ l − m. Let us
suppose that l < l'. Since m is positive, these two relations are incompatible,
and the summation contains no term which is different from zero.
Hence the integral (51) vanishes. If l > l', it must also be zero because
the integrand is perfectly symmetrical with respect to l and l'. To show
this result explicitly, the partial integrations must be performed in the
reverse manner.

If l' = l the two relations (52) are indeed compatible, but only for the
single value λ = l − m. Hence the sum over λ contains only one term
and the integral becomes

    (-1)^{l+m}\int_{-1}^{1} X^l\binom{l+m}{l-m}\frac{d^{2m}}{dx^{2m}}(X^m)\,\frac{d^{2l}}{dx^{2l}}(X^l)\, dx = (-1)^{l+m}\binom{l+m}{l-m}(2m)!\,(2l)!\int_{-1}^{1} X^l\, dx

The remaining integral has already been computed (preceding eq. 35);
it was found to be

    \frac{(-1)^l\, 2^{2l+1}(l!)^2}{(2l+1)!}

On the other hand,

    \binom{l+m}{l-m} = \frac{(l+m)!}{(l-m)!\,(2m)!}

Thus we find, collecting the various factors,

    \int_{-1}^{1} [P_l^m(x)]^2\, dx = \frac{(-1)^m}{2^{2l}(l!)^2} \cdot \frac{(l+m)!\,(-1)^{l+m}(2l)!\,(2m)!}{(l-m)!\,(2m)!} \cdot \frac{(-1)^l\, 2^{2l+1}(l!)^2}{(2l+1)!} = \frac{(l+m)!}{(l-m)!} \cdot \frac{2}{2l+1}

It has thus been proved that

    \int_{-1}^{1} P_l^m(x)P_{l'}^m(x)\, dx = \frac{(l+m)!}{(l-m)!}\,\frac{2}{2l+1}\,\delta_{ll'}        (3-53)

If x is taken to be cos θ, this result may be written in the equivalent form:

    \int_0^\pi P_l^m(\cos\theta)P_{l'}^m(\cos\theta)\sin\theta\, d\theta = \frac{(l+m)!}{(l-m)!}\,\frac{2}{2l+1}\,\delta_{ll'}        (3-53a)
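
Since P_l^m P_{l'}^m = (1 − x²)^m P_l^{(m)} P_{l'}^{(m)} is a polynomial, (53) can also be confirmed exactly. The sketch below uses rational arithmetic and the definition (43); all helper names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def legendre_coeffs(l):
    """Exact coefficients c[k] of x**k in P_l(x)."""
    c = [Fraction(0)] * (l + 1)
    for lam in range(l // 2 + 1):
        c[l - 2 * lam] = Fraction(
            (-1) ** lam * factorial(2 * l - 2 * lam),
            2 ** l * factorial(lam) * factorial(l - lam) * factorial(l - 2 * lam))
    return c

def derivative(c, m):
    """Coefficients of the m-th derivative of a polynomial."""
    for _ in range(m):
        c = [k * c[k] for k in range(1, len(c))]
    return c

def poly_mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def integrate(c):
    """Exact integral of a polynomial over [-1, 1]."""
    return sum((ck * Fraction(2, k + 1) for k, ck in enumerate(c) if k % 2 == 0), Fraction(0))

def assoc_overlap(l, lp, m):
    """Left-hand side of (3-53): integral of P_l^m P_lp^m over [-1, 1]."""
    weight = [Fraction(1)]
    for _ in range(m):
        weight = poly_mul(weight, [Fraction(1), Fraction(0), Fraction(-1)])  # (1 - x^2)
    product = poly_mul(derivative(legendre_coeffs(l), m), derivative(legendre_coeffs(lp), m))
    return integrate(poly_mul(weight, product))

def expected_norm(l, m):
    """Right-hand side of (3-53) for l = l'."""
    return Fraction(factorial(l + m), factorial(l - m)) * Fraction(2, 2 * l + 1)

if __name__ == "__main__":
    print(assoc_overlap(3, 3, 2), expected_norm(3, 2), assoc_overlap(2, 4, 1))
```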


3.7. Addition Theorem for Legendre Polynomials. — To prove the
famous addition theorem for Legendre polynomials (eq. 61) it is necessary
first to establish a formula due to Heine. If we substitute the Schlaefli
integral, eq. (31), for P_l in the definition of P_l^m (eq. 43) and carry out the
differentiations with respect to x under the integral sign, we have

    P_l^m(x) = (l+1)(l+2)\cdots(l+m)(1-x^2)^{m/2}\,\frac{2^{-l}}{2\pi i}\oint (z^2-1)^l (z-x)^{-l-m-1}\, dz

Then let z = x + √(x² − 1) e^{iφ} and integrate over φ from −π to π in accordance
with the meaning of the contour (cf. eq. 31 et seq.). Then

    P_l^m(x) = \frac{(l+1)(l+2)\cdots(l+m)}{2\pi}(1-x^2)^{m/2}\int_{-\pi}^{\pi}\frac{[x + \sqrt{x^2-1}\,\cos\varphi]^l}{[\sqrt{x^2-1}\,e^{i\varphi}]^m}\, d\varphi

             = \frac{(l+1)(l+2)\cdots(l+m)}{2\pi}(-1)^{m/2}\int_{-\pi}^{\pi}[x + \sqrt{x^2-1}\,\cos\varphi]^l e^{-im\varphi}\, d\varphi        (3-54)

    P_l^m(x) = \frac{(l+1)(l+2)\cdots(l+m)}{\pi}(-1)^{m/2}\int_0^\pi [x + \sqrt{x^2-1}\,\cos\varphi]^l \cos m\varphi\, d\varphi        (3-54a)

In taking the last step we observe that, of the two constituents of
e^{−imφ} = cos mφ − i sin mφ, the first is an even, the second an odd function
of φ. Since the other remaining factor of the integrand is even, only the
cosine part of e^{−imφ} will give a finite integral, and this has twice the value of
the integral between the limits 0 and π. Eq. (54a) is Heine's formula.

If in the differential equation for P_l^m(x) we substitute −l − 1 for l, the
equation remains unaltered. Therefore P_{−l−1}^m = P_l^m. In view of this,
Heine's formula may also be written

    P_l^m(x) = P_{-l-1}^m(x) = \frac{(-l)(-l+1)\cdots(-l-1+m)}{\pi}(-1)^{m/2}\int_0^\pi \frac{\cos m\varphi\, d\varphi}{[x + \sqrt{x^2-1}\,\cos\varphi]^{l+1}}

             = \frac{l(l-1)\cdots(l-m+1)}{\pi}(-1)^{3m/2}\int_0^\pi \frac{\cos m\varphi\, d\varphi}{[x + \sqrt{x^2-1}\,\cos\varphi]^{l+1}}        (3-54b)


To prove the addition theorem we consider the equation

    \sum_{l=0}^{\infty} p^l\,\frac{[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)]^l}{[x_2 + (x_2^2-1)^{1/2}\cos\alpha]^{l+1}}
        = \{x_2 + (x_2^2-1)^{1/2}\cos\alpha - p[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)]\}^{-1}        (3-55)

which is an identity for sufficiently small values of the parameter p. All
other quantities appearing in (55) are supposed to be real, but otherwise
unrestricted. This relation is simply an application of the expansion

    \sum_{l=0}^{\infty} x^l = (1-x)^{-1}, \quad |x| < 1

Let us integrate eq. (55) over dα from −π to π. The integral on the right
may be evaluated by means of the formula

    \int_{-\pi}^{\pi}\frac{d\alpha}{a + b\cos\alpha + c\sin\alpha} = \frac{2\pi}{(a^2 - b^2 - c^2)^{1/2}}

which was proved as an example on page 93. Here

    a = x_2 - px_1
    b = (x_2^2-1)^{1/2} - p(x_1^2-1)^{1/2}\cos\omega
    c = -p(x_1^2-1)^{1/2}\sin\omega

Hence the right-hand side of (55) becomes after integration

    2\pi\{1 - 2p[x_1x_2 - (x_1^2-1)^{1/2}(x_2^2-1)^{1/2}\cos\omega] + p^2\}^{-1/2}

As will be seen forthwith, the expression in [ ] appearing here has a very
simple geometrical meaning. For the present, let us designate it by x:

    x = x_1x_2 - (x_1^2-1)^{1/2}(x_2^2-1)^{1/2}\cos\omega        (3-56)

The result of the integration may therefore be written

    2\pi(1 - 2px + p^2)^{-1/2}

But by the theorem on Legendre functions, eq. (24), this is

    2\pi\sum_{l=0}^{\infty} p^l P_l(x)

The left-hand side of (55) may be integrated term by term because the
expression is assumed to converge. On comparing coefficients of p^l we
see, therefore, that

    P_l(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)]^l}{[x_2 + (x_2^2-1)^{1/2}\cos\alpha]^{l+1}}\, d\alpha        (3-57)


The last step of the proof involves an expansion of P_l(x) in a Fourier
series.⁵ Clearly, P_l(x), being a polynomial of degree l in cos ω, can be
expressed in the form

    P_l(x) = \tfrac{1}{2}c_0 + \sum_{m=1}^{l} c_m\cos m\omega        (3-58)

The coefficients c_m are given by

    c_m = \frac{1}{\pi}\int_{-\pi}^{\pi} P_l(x)\cos m\omega\, d\omega
        = \frac{1}{2\pi^2}\int_{-\pi}^{\pi} d\alpha\int_{-\pi}^{\pi} d\omega\,\frac{[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)]^l}{[x_2 + (x_2^2-1)^{1/2}\cos\alpha]^{l+1}}\cos m\omega

Let us introduce the variable ω − α = β in place of ω, so that

    c_m = \frac{1}{2\pi^2}\int_{-\pi}^{\pi}\frac{d\alpha}{[x_2 + (x_2^2-1)^{1/2}\cos\alpha]^{l+1}}\int_{-\pi}^{\pi}[x_1 + (x_1^2-1)^{1/2}\cos\beta]^l(\cos m\alpha\cos m\beta - \sin m\alpha\sin m\beta)\, d\beta

The integration with respect to β over the term containing sin mβ is
obviously zero because the integrand is odd. The other term can be evaluated
by means of eq. (54a). The result is

    c_m = \frac{1}{2\pi^2}\int_{-\pi}^{\pi}\frac{\cos m\alpha\, d\alpha}{[x_2 + (x_2^2-1)^{1/2}\cos\alpha]^{l+1}} \cdot \frac{2\pi(-1)^{-m/2}}{(l+1)(l+2)\cdots(l+m)}\, P_l^m(x_1)

In the remaining integration over α we use (54b), obtaining

    c_m = 2\,\frac{(l-m)!}{(l+m)!}\, P_l^m(x_1)P_l^m(x_2)

Hence from (58):

    P_l(x) = P_l(x_1)P_l(x_2) + 2\sum_{m=1}^{l}\frac{(l-m)!}{(l+m)!}\, P_l^m(x_1)P_l^m(x_2)\cos m\omega        (3-59)

which is the desired addition theorem.

Finally, let us investigate the meaning of x defined in eq. (56). If
θ_1, φ_1 and θ_2, φ_2 denote, respectively, the polar and azimuthal angles of
two lines passing through the origin, then Θ, the angle between these
lines, is given by

    \cos\Theta = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2\cos(\varphi_1 - \varphi_2)        (3-60)

Thus, if in (56) the following identifications are made:

    x = cos Θ,  x_1 = cos θ_1,  x_2 = cos θ_2,  ω = φ_1 − φ_2

then (59) becomes

    P_l(\cos\Theta) = P_l(\cos\theta_1)P_l(\cos\theta_2) + 2\sum_{m=1}^{l}\frac{(l-m)!}{(l+m)!}\, P_l^m(\cos\theta_1)P_l^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2)        (3-61)
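
The addition theorem (61) is easily tested numerically for given angles. In the sketch below, P_l^m follows the definition (43) without any additional phase factor; the helper names and the sample angles are illustrative assumptions.

```python
import math
from math import factorial

def legendre_coeffs(l):
    """Floating-point coefficients c[k] of x**k in P_l(x)."""
    c = [0.0] * (l + 1)
    for lam in range(l // 2 + 1):
        c[l - 2 * lam] = ((-1) ** lam * factorial(2 * l - 2 * lam)
                          / (2 ** l * factorial(lam)
                             * factorial(l - lam) * factorial(l - 2 * lam)))
    return c

def assoc_legendre(l, m, x):
    """P_l^m(x) = (1 - x^2)^(m/2) d^m P_l / dx^m, as in (3-43)."""
    c = legendre_coeffs(l)
    for _ in range(m):
        c = [k * c[k] for k in range(1, len(c))]
    return (1.0 - x * x) ** (m / 2) * sum(ck * x ** k for k, ck in enumerate(c))

def addition_theorem_sides(l, th1, ph1, th2, ph2):
    """Return (left, right) sides of (3-61) for the given polar and azimuthal angles."""
    x1, x2 = math.cos(th1), math.cos(th2)
    cos_big = x1 * x2 + math.sin(th1) * math.sin(th2) * math.cos(ph1 - ph2)  # eq. (3-60)
    lhs = assoc_legendre(l, 0, cos_big)
    rhs = assoc_legendre(l, 0, x1) * assoc_legendre(l, 0, x2)
    for m in range(1, l + 1):
        rhs += (2 * factorial(l - m) / factorial(l + m)
                * assoc_legendre(l, m, x1) * assoc_legendre(l, m, x2)
                * math.cos(m * (ph1 - ph2)))
    return lhs, rhs

if __name__ == "__main__":
    print(addition_theorem_sides(3, 0.7, 0.2, 1.9, 2.5))
```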

In quantum mechanics (cf. sec. 11.12) it is convenient to use associated
Legendre functions which differ from P_l^m(x) by factors depending on l and
m, but constant with respect to x and so chosen that the integral over the
square of the functions is unity. These functions are called "normalized"
associated Legendre functions. They will here be denoted by Π_l^m. Let
us put Π_l^m(x) = N_{lm} P_l^m(x). Then if we wish

    \int_0^\pi [\Pi_l^m(\cos\theta)]^2\sin\theta\, d\theta

to be equal to unity we must, in view of (53), put

    N_{lm} = \left[\frac{2l+1}{2}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2}

It is also customary to permit the index m of Π_l^m to be negative and to
define

    \Pi_l^m(x) = \left[\frac{2l+1}{2}\,\frac{(l-|m|)!}{(l+|m|)!}\right]^{1/2} P_l^{|m|}(x)        (3-62)

The index m may then take on all integral values including zero from −l
to +l, while l is always a positive integer. In terms of the functions Π_l^m,
the addition theorem takes a particularly simple form:

    \frac{2l+1}{2}\, P_l(\cos\Theta) = \sum_{m=-l}^{l}\Pi_l^m(\cos\theta_1)\,\Pi_l^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2)        (3-61a)

One may also replace the factor cos m(φ_1 − φ_2) in each term of this summation
by e^{im(φ_1 − φ_2)} because each pair of terms corresponding to +m and −m
then yields a cosine function.

Problem: Express eqs. (48) to (50) in terms of Π-functions.


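3.8. Bessel Functions. — In sec. 2.14 we have shown that a particular
solution of Bessel's differential equation (eq. 2-57) is the "Bessel function
of order n," defined as (cf. eq. 2-60)

    J_n(x) = \sum_{\lambda=0}^{\infty}\frac{(-1)^\lambda}{\Gamma(\lambda+1)\Gamma(\lambda+n+1)}\left(\frac{x}{2}\right)^{n+2\lambda}        (3-63)

It is of interest, first, to note that for integral n, J_n(x) is the coefficient of u^n
in the expansion of exp[(x/2)(u − 1/u)]. In fact J_n(x) for integral n,
called Bessel's coefficient, was originally defined by means of this relation.
To prove it we merely expand the exponential, using the binomial theorem⁶
to express (u − 1/u)^ν:

    \exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right] = \sum_{\nu=0}^{\infty}\frac{1}{\nu!}\left(\frac{x}{2}\right)^\nu\left(u - \frac{1}{u}\right)^\nu = \sum_{\nu=0}^{\infty}\left(\frac{x}{2}\right)^\nu\sum_{\lambda=0}^{\nu}\frac{(-1)^\lambda}{\lambda!(\nu-\lambda)!}\, u^{\nu-2\lambda}

If we now put ν − 2λ = n, this becomes

    \exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right] = \sum_{n=-\infty}^{\infty} u^n\left[\sum_{\lambda=0}^{\infty}\frac{(-1)^\lambda}{(n+\lambda)!\,\lambda!}\left(\frac{x}{2}\right)^{n+2\lambda}\right]        (3-64)

For integral n, the bracket appearing here is identical with the expansion
(63); hence the above-mentioned theorem is proved.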

From (64) an integral representation may be derived quite simply.
By the theorem of residues (eq. 3) the coefficient of z^{−1} in an expansion of
f(z) is given by

    a_{-1} = \frac{1}{2\pi i}\oint f(z)\, dz

the integral being taken in a counter-clockwise sense about z = 0. Similarly,
the coefficient of z^n in the expansion of f(z) will be:

    a_n = \frac{1}{2\pi i}\oint \frac{f(z)}{z^{n+1}}\, dz

The theorem just proved is therefore tantamount to the relation

    J_n(x) = \frac{1}{2\pi i}\oint u^{-n-1} e^{(x/2)(u - 1/u)}\, du        (3-65)

It is customary to write this result in a slightly different form, obtainable
on replacement of the variable u by 2t/x. Eq. (65) then reads

    J_n(x) = \frac{1}{2\pi i}\left(\frac{x}{2}\right)^n\oint t^{-n-1}\exp\left(t - \frac{x^2}{4t}\right) dt        (3-66)


While this integral has been shown here to be identical with the convergent
sum of eq. (63) only if n is an integer, a more special consideration
would indeed establish the equivalence of (63) and (66) for non-integral
n also. A simple way to prove this fact is to show, by substitution, that
(66) satisfies Bessel's differential equation. On performing the differentiations
indicated on the left of eq. (2-57)⁷ and substituting therein, we find

    \frac{x^{n+2}}{2^{n+1}\pi i}\oint t^{-n-1}\left[1 - \frac{n+1}{t} + \frac{x^2}{4t^2}\right]\exp\left(t - \frac{x^2}{4t}\right) dt
        = \frac{x^{n+2}}{2^{n+1}\pi i}\oint \frac{d}{dt}\left[t^{-n-1}\exp\left(t - \frac{x^2}{4t}\right)\right] dt = 0

because the integral around a closed loop of an exact differential is zero.
We may therefore regard either (63) or (66) as a definition of J_n(x) for both
integral and non-integral values of n. For non-integral n, however, caution
is required in the choice of the contour of integration in (66). This must
clearly enclose the origin. But if we were to take, for example, a circle
about the origin as center we should encounter a difficulty. For non-integral
n the integrand is a many-valued function of t. Thus if the amplitude
of t should vary, say, from −π to +π, the integrand will not have performed
a closed loop⁸ and the last equation above would not be true. It is necessary,
therefore, to select a path of integration which (a) encloses the origin
and no other singularity of the integrand; (b) starts and ends at a point in
the t-plane which will cause the integrand to perform a closed loop also.

⁷ The differentiations may be carried out in (66) without regard to the fixed path of
integration, that is, "under the integral sign."



Fig. 3-6 


Such a path is that illustrated in Fig. 6. Whenever eq. (65) or (66) is used, 
we shall understand that the contour integral is taken along this path. 
(The reader familiar with the theory of many-valued functions will observe 
that this path confines the integrand entirely to one of its branches pro- 
vided that the argument of t is given its principal value.) 

Recurrence Relations. From (66) one may show by direct differentiation
that

    \frac{d}{dx}[x^{-n}J_n(x)] = -x^{-n}J_{n+1}(x)        (3-67)

or, when the differentiation on the left is carried out,

    J'_n(x) = \frac{n}{x}J_n(x) - J_{n+1}(x)        (3-68)

To obtain the other fundamental recurrence formula we perform the
differentiations in the equation

    \oint \frac{d}{dt}\left[t^{-n}\exp\left(t - \frac{x^2}{4t}\right)\right] dt = 0

The result is

    \oint \left[-n\,t^{-n-1} + t^{-n} + \frac{x^2}{4}\,t^{-n-2}\right]\exp\left(t - \frac{x^2}{4t}\right) dt = 0

⁸ As an example of many-valued functions, consider z^{1/2} = ρ^{1/2} e^{iθ/2}. When





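When use is now made of the definition (66) this reads

    2\pi i\left(\frac{2}{x}\right)^{n-1}\left[J_{n-1} + J_{n+1} - \frac{2n}{x}J_n\right] = 0

hence

    J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x}J_n(x)        (3-69)

On eliminating J_{n+1} from (68) by means of (69) there results

    J'_n(x) = J_{n-1}(x) - \frac{n}{x}J_n(x)        (3-70)

and from this and (68)

    J'_n(x) = \tfrac{1}{2}[J_{n-1}(x) - J_{n+1}(x)]        (3-71)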
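
The recurrence relations (69) and (71) hold for non-integral n as well, which is simple to confirm from the series (63). The sketch below assumes x > 0 and orders for which none of n − 1, n, n + 1 is a negative integer (or n − 1 = −1), so that `math.gamma` is defined; the derivative in (71) is approximated by a central difference, an assumed numerical device.

```python
import math

def bessel_j(n, x, terms=40):
    """J_n(x) from the power series (3-63), truncated after `terms` terms."""
    return sum((-1) ** k / (math.gamma(k + 1) * math.gamma(k + n + 1))
               * (x / 2) ** (n + 2 * k)
               for k in range(terms))

def recurrence_residuals(n, x, h=1e-5):
    """Residuals of (3-69) and (3-71); both should be close to zero."""
    r69 = bessel_j(n - 1, x) + bessel_j(n + 1, x) - (2 * n / x) * bessel_j(n, x)
    deriv = (bessel_j(n, x + h) - bessel_j(n, x - h)) / (2 * h)
    r71 = deriv - 0.5 * (bessel_j(n - 1, x) - bessel_j(n + 1, x))
    return r69, r71

if __name__ == "__main__":
    for n in (2, 0.3):
        print(n, recurrence_residuals(n, 1.7))
```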

Bessel's Integral. Let us consider J_n(x) as defined by (65):

    J_n(x) = \frac{1}{2\pi i}\oint u^{-n-1} e^{(x/2)(u - 1/u)}\, du

Let the contour start at −∞, proceed to the right below the real axis
(u = te^{−iπ}, ∞ > t > 1) up to the point u = −1, then describe a circle of
unit radius in a counter-clockwise sense about the origin (u = e^{iθ},
−π < θ < π), and finally return to −∞ above the real axis (u = te^{iπ},
1 < t < +∞). The contour integral then becomes

    J_n(x) = \frac{e^{(n+1)i\pi}}{2\pi i}\int_1^\infty t^{-n-1}\exp\left[-\frac{x}{2}\left(t - \frac{1}{t}\right)\right] dt
           + \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ni\theta + ix\sin\theta}\, d\theta
           - \frac{e^{-(n+1)i\pi}}{2\pi i}\int_1^\infty t^{-n-1}\exp\left[-\frac{x}{2}\left(t - \frac{1}{t}\right)\right] dt

The second of these integrals may be written

    \frac{1}{\pi}\int_0^\pi \cos(n\theta - x\sin\theta)\, d\theta

because the odd part of the exponential, sin(nθ − x sin θ), vanishes on
integration between −π and +π. The first and last may be transformed by
putting t = e^θ and noting that

    e^\theta - e^{-\theta} = 2\sinh\theta

When they are combined, the result is

    -\frac{\sin n\pi}{\pi}\int_0^\infty \exp(-n\theta - x\sinh\theta)\, d\theta

Hence

    J_n(x) = \frac{1}{\pi}\left[\int_0^\pi \cos(n\theta - x\sin\theta)\, d\theta - \sin n\pi\int_0^\infty \exp(-n\theta - x\sinh\theta)\, d\theta\right]        (3-72)

This is a generalized form of Bessel's integral, derived by Bessel for integral
values of n. In that special case the second integral vanishes and

    J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(n\theta - x\sin\theta)\, d\theta, \quad n = \text{integer}        (3-72a)
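
For integral n, Bessel's integral (72a) and the series (63) must agree, and this is easy to confirm numerically. The sketch below is illustrative only: the function names are hypothetical, and the midpoint rule is an assumed choice of quadrature that converges very rapidly here because the integrand, continued evenly, is smooth and periodic.

```python
import math

def bessel_series(n, x, terms=40):
    """J_n(x) for integral n >= 0 from the series (3-63)."""
    return sum((-1) ** k / (math.gamma(k + 1) * math.gamma(k + n + 1))
               * (x / 2) ** (n + 2 * k)
               for k in range(terms))

def bessel_integral(n, x, steps=2000):
    """J_n(x) for integral n from Bessel's integral (3-72a), midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += math.cos(n * theta - x * math.sin(theta))
    return total * h / math.pi

if __name__ == "__main__":
    for n, x in ((0, 1.0), (1, 1.0), (3, 2.5)):
        print(n, x, bessel_series(n, x), bessel_integral(n, x))
```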


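Bessel Functions of Half-Odd Order. When n is half an odd integer,
for instance, p + ½, J_n(x) takes a particularly simple form and is related
closely to the trigonometric functions. To show this, let us first compute
J_{1/2}(x) by the expansion (63). We may then use the recurrence formulas
to obtain J_{3/2}, etc. Thus,

    J_{1/2}(x) = \left(\frac{x}{2}\right)^{1/2}\sum_{\lambda}\frac{(-1)^\lambda x^{2\lambda}}{2^{2\lambda}\,\lambda!\,\Gamma(\tfrac{3}{2} + \lambda)}

But in view of eq. (5), etc., [Γ(x + 1) = xΓ(x)]

    \Gamma(\tfrac{3}{2} + \lambda) = \frac{2\lambda+1}{2} \cdot \frac{2\lambda-1}{2}\cdots\frac{3}{2} \cdot \frac{1}{2}\,\Gamma(\tfrac{1}{2}) = \frac{(2\lambda+1)!}{2^{2\lambda+1}\,\lambda!}\,\sqrt{\pi}

When this is substituted in the series for J_{1/2}(x), there results

    J_{1/2}(x) = \left(\frac{x}{2}\right)^{1/2}\frac{2}{\sqrt{\pi}}\sum_{\lambda}\frac{(-1)^\lambda x^{2\lambda}}{(2\lambda+1)!} = \left(\frac{2}{\pi x}\right)^{1/2}\sum_{\lambda}\frac{(-1)^\lambda x^{2\lambda+1}}{(2\lambda+1)!}

               = \left(\frac{2}{\pi x}\right)^{1/2}\sin x        (3-73)

From (67),

    J_{3/2}(x) = -x^{1/2}\frac{d}{dx}[x^{-1/2}J_{1/2}(x)] = \left(\frac{2}{\pi x}\right)^{1/2}\left(\frac{\sin x}{x} - \cos x\right)

This process may be continued if the explicit form of the functions J_{p+1/2}(x)
is desired. A general formula is readily obtainable as follows. Eq. (67)
may be written

    x^{-n-1}J_{n+1}(x) = -\frac{1}{x}\frac{d}{dx}[x^{-n}J_n(x)] = -2\frac{d}{d(x^2)}[x^{-n}J_n(x)]

or, by repeated application of (67)

    x^{-n-p}J_{n+p}(x) = (-2)^p\frac{d^p}{d(x^2)^p}[x^{-n}J_n(x)]        (3-74)

Hence, on putting n = ½,

    J_{p+1/2}(x) = x^{p+1/2}(-2)^p\frac{d^p}{d(x^2)^p}[x^{-1/2}J_{1/2}(x)] = \frac{(-1)^p(2x)^{p+1/2}}{\pi^{1/2}}\frac{d^p}{d(x^2)^p}\left(\frac{\sin x}{x}\right)        (3-75)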

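The first few functions of half-odd order are given below.

p    $\sqrt{\pi x/2}\;J_{p+1/2}(x)$    $\sqrt{\pi x/2}\;J_{-p-1/2}(x)$

0    $\sin x$    $\cos x$

1    $\dfrac{\sin x}{x} - \cos x$    $-\sin x - \dfrac{\cos x}{x}$

2    $\left(\dfrac{3}{x^2} - 1\right)\sin x - \dfrac{3}{x}\cos x$    $\dfrac{3}{x}\sin x + \left(\dfrac{3}{x^2} - 1\right)\cos x$

3    $\left(\dfrac{15}{x^3} - \dfrac{6}{x}\right)\sin x - \left(\dfrac{15}{x^2} - 1\right)\cos x$    $-\left(\dfrac{15}{x^2} - 1\right)\sin x - \left(\dfrac{15}{x^3} - \dfrac{6}{x}\right)\cos x$

4    $\left(\dfrac{105}{x^4} - \dfrac{45}{x^2} + 1\right)\sin x - \left(\dfrac{105}{x^3} - \dfrac{10}{x}\right)\cos x$    $\left(\dfrac{105}{x^3} - \dfrac{10}{x}\right)\sin x + \left(\dfrac{105}{x^4} - \dfrac{45}{x^2} + 1\right)\cos x$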
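The tabulated closed forms can be compared with the power series for $J_{\pm(p+1/2)}(x)$. The sketch below assumes the standard series with the gamma function (taken here to be the expansion (63) of the text) and that the two table columns are $\sqrt{\pi x/2}\,J_{p+1/2}$ and $\sqrt{\pi x/2}\,J_{-p-1/2}$. The function names are hypothetical.

```python
import math

def bessel_j(n, x, terms=40):
    """Power series for J_n(x), x > 0; n may be a negative half-odd number."""
    return sum((-1) ** k * (x / 2) ** (n + 2 * k) / (math.factorial(k) * math.gamma(n + k + 1))
               for k in range(terms))

def half_odd_closed_forms(x):
    """Rows p = 0..4 as (sqrt(pi x/2) J_{p+1/2}, sqrt(pi x/2) J_{-p-1/2})."""
    s, c = math.sin(x), math.cos(x)
    return [
        (s, c),
        (s / x - c, -s - c / x),
        ((3 / x**2 - 1) * s - 3 / x * c, 3 / x * s + (3 / x**2 - 1) * c),
        ((15 / x**3 - 6 / x) * s - (15 / x**2 - 1) * c,
         -(15 / x**2 - 1) * s - (15 / x**3 - 6 / x) * c),
        ((105 / x**4 - 45 / x**2 + 1) * s - (105 / x**3 - 10 / x) * c,
         (105 / x**3 - 10 / x) * s + (105 / x**4 - 45 / x**2 + 1) * c),
    ]

def max_table_error(x):
    """Largest absolute difference between the series and the closed forms at x."""
    scale = math.sqrt(math.pi * x / 2)
    worst = 0.0
    for p, (pos, neg) in enumerate(half_odd_closed_forms(x)):
        worst = max(worst,
                    abs(scale * bessel_j(p + 0.5, x) - pos),
                    abs(scale * bessel_j(-p - 0.5, x) - neg))
    return worst

if __name__ == "__main__":
    print(max_table_error(2.3))
```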


When the differentiations in (75) are carried out it is easily established that the asymptotic form of $J_{p+1/2}$ is given, for all p, by

$$J_{p+1/2}(x) \to \left(\frac{2}{\pi x}\right)^{1/2}\sin\left(x - \frac{p\pi}{2}\right) \tag{3-76}$$


3.9. Hankel Functions and Summary on Bessel Functions. The Bessel function $J_n(x)$ is only one particular solution of Bessel's differential equation. However, as was noted in sec. 2.14, $J_{-n}(x)$ is also a particular solution, for the differential equation is insensitive to the substitution of $-n$ for n. Hence a general solution of the form

$$y = aJ_n(x) + bJ_{-n}(x) \tag{3-77}$$

is found, provided $J_n$ and $J_{-n}$ are different functions, i.e., linearly independent. As was also shown in sec. 2.14, this is true as long as n is not an integer. The Hankel function, frequently used in physical problems, 9 is of interest only in connection with non-integral n and the following remarks are restricted to that case. This function is a solution of Bessel's equation of the form (77) with the constants a and b suitably chosen. We distinguish two kinds of Hankel function, generally denoted by $H_n^{(1)}$ and $H_n^{(2)}$:

$$H_n^{(1)}(x) = \frac{i}{\sin n\pi}\left[e^{-in\pi}J_n(x) - J_{-n}(x)\right]$$
$$H_n^{(2)}(x) = -\frac{i}{\sin n\pi}\left[e^{in\pi}J_n(x) - J_{-n}(x)\right] \tag{3-78}$$

Hence, conversely,

$$J_n(x) = \tfrac12\left[H_n^{(1)}(x) + H_n^{(2)}(x)\right]$$
$$J_{-n}(x) = \tfrac12\left[e^{in\pi}H_n^{(1)}(x) + e^{-in\pi}H_n^{(2)}(x)\right] \tag{3-79}$$


These definitions hold, of course, for complex as well as for real values of the 
argument. Hankel functions are particularly useful for complex argu- 
ments, for they vanish strongly when the modulus of the argument 
approaches infinity, which is a requirement in many physical problems. 

The qualitative properties of Bessel and Hankel functions may be 
summarized in the following brief survey. 


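A. $J_n(z)$ is real if z is real, complex if z is complex.

1. At x = 0,

$$J_n(x) = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{if } n > 0 \\ 0 & \text{if } n < 0 \text{, and } n \text{ is an integer} \\ \infty & \text{if } n < 0 \text{, and } n \text{ is not an integer} \end{cases}$$

2. At $x \to \infty$, all $J_n(x)$ oscillate, but with ever-decreasing amplitude (provided x is real).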


lim J n (x) 

X — ► 00 



if n is even 


if n is odd 


9 See, for instance, Stratton, J., “ Electromagnetic Theory,” McGraw-Hill Book Co., 
1941. For applications in: propagation of radio waves, cf. Sommerfeld, A., Ann. der 


B. $H_n(z)$ is complex if z is real, but $i^{n+1}H_n^{(1)}(ix)$ and $i^{-(n+1)}H_n^{(2)}(-ix)$ are always real if x is real and > 0.

1. At z = 0, both $H_n^{(1)}$ and $H_n^{(2)}$ become infinite. In fact

$$\lim_{x\to 0} i^{n+1}H_n^{(1)}(ix) = \lim_{x\to 0} i^{-(n+1)}H_n^{(2)}(-ix) = \frac{\Gamma(n)}{\pi}\left(\frac{2}{x}\right)^n$$

2. At $z \to \infty$, either $H_n^{(1)}(z)$ or $H_n^{(2)}(z)$ vanishes exponentially.

$$\lim_{|z|\to\infty} H_n^{(1)}(z) = \begin{cases} 0 & \text{if the imaginary part of } z > 0 \\ \infty & \text{if the imaginary part of } z < 0 \end{cases}$$

$$\lim_{|z|\to\infty} H_n^{(2)}(z) = \begin{cases} \infty & \text{if the imaginary part of } z > 0 \\ 0 & \text{if the imaginary part of } z < 0 \end{cases}$$

The behavior at infinity of both $J_n$ and $H_n$ is most easily remembered by noting the general similarity between

$H_n^{(1)}(z)$ and $e^{iz}$

$H_n^{(2)}(z)$ and $e^{-iz}$

$J_n(z)$ and $\tfrac12(e^{iz} + e^{-iz}) = \cos z$

The important difference between the Bessel functions and the circular 
functions is in the fact that the former have neither constant amplitude nor 
constant wave length. 

Useful Formulas Involving Bessel Functions. We conclude the discus- 
sion of Bessel functions by appending here a list of formulas involving 
Bessel functions. Some of these are easily proved with the use of the 
theory here developed; to establish others reference should be made to 
more comprehensive treatises, such as that of Nielsen 10 and that of Gray 
and Mathews. 11 An extensive table of differential equations having Bessel 
functions as solutions is given in Jahnke and Emde. 12 

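$$|J_0(x)| \le 1, \qquad |J_n(x)| \le \frac{1}{\sqrt2} \text{ for } n \ge 1;\ x \text{ real}$$

$$[J_0(z)]^2 + 2\sum_{n=1}^{\infty}[J_n(z)]^2 = 1$$

$$J_{n-1}(z)J_{-n}(z) + J_{-n+1}(z)J_n(z) = \frac{2\sin n\pi}{\pi z}$$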

10 Nielsen, N., “Handbuch der Theorie der Cylinderfunktionen,” Teubner, 1904. 

11 Gray, A., and Mathews, G. B., “ A Treatise on Bessel Functions,” Macmillan. 
London, 1922. 

1 0 t i i % n i -r* li m v i . f . _ ur-ii tt> t •» /"i _ 




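$$\int_0^x x^m J_{n+1}(x)J_{n-1}(x)\,dx = \frac{x^{m+1}}{m-1}\left[J_{n+1}(x)J_{n-1}(x) - (J_n(x))^2\right] + \frac{m+1}{m-1}\int_0^x x^m[J_n(x)]^2\,dx$$

provided n is a positive integer and m + 1 > 0

$$J_{n-1}(x)H_n^{(1)}(x) - J_n(x)H_{n-1}^{(1)}(x) = H_{n-1}^{(2)}(x)J_n(x) - H_n^{(2)}(x)J_{n-1}(x) = \frac{2}{\pi i x}$$

$$J_n(x_1 + x_2) = \left(1 + \frac{x_2}{x_1}\right)^n\sum_{\lambda=0}^{\infty}\frac{(-1)^{\lambda}x_2^{\lambda}}{\lambda!}\left(1 + \frac{x_2}{2x_1}\right)^{\lambda}J_{n+\lambda}(x_1)$$

$$\int_0^x J_n(x)\,dx = 2\sum_{\lambda=0}^{\infty}J_{n+2\lambda+1}(x)$$

$$\int_0^x x[J_n(\alpha x)]^2\,dx = \frac{x^2}{2}\left\{[J_n(\alpha x)]^2 - J_{n-1}(\alpha x)J_{n+1}(\alpha x)\right\}$$

This formula is also valid when all J's are replaced by $H^{(1)}$ or $H^{(2)}$.

$$\int_0^{\infty} x^{-n+m}J_n(\alpha x)\,dx = 2^{-n+m}\alpha^{n-m-1}\,\frac{\Gamma\left(\dfrac{m+1}{2}\right)}{\Gamma\left(n - \dfrac{m-1}{2}\right)}$$

if 2n + 1 > m > -1.

$$\lim_{x\to\infty}J_n(x) = \left(\frac{2}{\pi x}\right)^{1/2}\cos\left(x - \frac{2n+1}{4}\pi\right)$$

$$\int xJ_n(\alpha x)J_n(\beta x)\,dx = \frac{\beta xJ_n(\alpha x)J_{n-1}(\beta x) - \alpha xJ_{n-1}(\alpha x)J_n(\beta x)}{\alpha^2 - \beta^2}$$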

3.10. Hermite Polynomials and Functions. In sec. 2.15 the Hermite polynomial of degree n has been defined as the polynomial solution of Hermite's differential equation

$$y'' - 2xy' + 2ny = 0 \tag{3-80}$$



Such solutions were seen to exist when n is an integer. Explicitly,

$$y = H_n(x) = (2x)^n - \frac{n(n-1)}{1!}(2x)^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2!}(2x)^{n-4} - \cdots \tag{3-81}$$


We shall now find an equivalent expression for $H_n$ in terms of a definite integral. If we put

$$y_n = \frac{1}{2\pi i}\oint z^{-n-1}e^{x^2-(z-x)^2}\,dz \tag{3-82}$$

and take the contour around a circle which has the origin as its center,

$$\frac{dy_n}{dx} = \frac{1}{2\pi i}\oint 2z^{-n}e^{x^2-(z-x)^2}\,dz \tag{3-82a}$$

and

$$\frac{d^2y_n}{dx^2} = \frac{1}{2\pi i}\oint 4z^{-n+1}e^{x^2-(z-x)^2}\,dz$$

The differentiations here may be performed under the integral sign. When these derivatives are substituted on the left of the differential equation (80) it is found that

$$y_n'' - 2xy_n' + 2ny_n = \frac{1}{2\pi i}\oint(4z^2 - 4xz + 2n)\,z^{-n-1}e^{x^2-(z-x)^2}\,dz = -\frac{2}{2\pi i}\oint\frac{d}{dz}\left(z^{-n}e^{x^2-(z-x)^2}\right)dz = 0$$


The last step follows because the contents of the parenthesis, being a single-valued function of z if n is an integer, takes the same value at the initial and final points of the contour integration. It has thus been shown that expression (82) is also a solution of Hermite's equation. 14 Since it represents a polynomial in x it must be identical with $H_n(x)$ except for a constant multiplier. This constant may be found by computing, for example, $H_n(0)$ from (81) and $y_n(0)$ from (82) for even n (since otherwise $H_n(0)$ would vanish). Eq. (81) gives

$$H_n(0) = \frac{(-1)^{n/2}\,n!}{(n/2)!}$$



14 The function y n defined by (82) is a solution of Hermite’s differential equal 
when n is non-integral, but in that case the contour must be specified differei 
make the integrand return to its original value, the path must start at « , go ii 




while from (82) we obtain

$$y_n(0) = \frac{1}{2\pi i}\oint z^{-n-1}e^{-z^2}\,dz = \frac{(-1)^{n/2}}{(n/2)!}$$

by the theorem of residues (eq. 3). Hence we see that $H_n$ is n! times $y_n$.

$$H_n(x) = \frac{n!}{2\pi i}\oint z^{-n-1}e^{x^2-(z-x)^2}\,dz \tag{3-83}$$

This result may be expressed in a different way. On examining (82) in the light of the theorem of residues it is apparent that $y_n$ is the coefficient of $z^n$ in the expansion of $e^{x^2-(z-x)^2}$ as a power series in z. Hence

$$e^{x^2-(z-x)^2} = \sum_{n=0}^{\infty}y_nz^n = \sum_{n=0}^{\infty}\frac{H_n(x)}{n!}z^n \tag{3-84}$$

Recurrence relations between Hermite polynomials of different degree are easily derived. The first is implicit in eq. (82a), which may now be written $y_n'(x) = 2y_{n-1}(x)$, or

$$H_n'(x) = 2nH_{n-1}(x) \tag{3-85}$$

The second follows from the differential equation

$$H_n''(x) - 2xH_n' + 2nH_n = 0 \tag{3-86}$$

Others may be derived ad libitum by repeated application of (85):

$$H_n'' = 4n(n-1)H_{n-2}, \text{ etc.}$$

Thus far two representations of $H_n$ have been obtained, the series form (81) and the integral form (83). A third may be deduced from (84). Let us take the n-th derivative with respect to z on both sides. The left becomes

$$e^{x^2}\frac{\partial^n}{\partial z^n}e^{-(z-x)^2} = e^{x^2}(-1)^n\frac{\partial^n}{\partial x^n}e^{-(z-x)^2}$$

and the right simply

$$H_n(x) + H_{n+1}(x)z + \cdots$$

These two expressions are equal for all values of z. On putting z = 0, there results

$$H_n(x) = (-1)^ne^{x^2}\frac{d^n}{dx^n}e^{-x^2} \tag{3-87}$$

A function closely related to $H_n(x)$ was encountered in sec. 2.15. It is

$$y = e^{-x^2/2}H_n(x) \tag{3-88}$$

and satisfies the differential equation

$$y'' + (1 - x^2 + 2n)y = 0 \tag{3-89}$$

The function defined by (88) is called the Hermite (orthogonal) function. It is of interest because it appears (cf. sec. 11.11) as an eigenfunction in the quantum mechanical problem of the simple harmonic oscillator. We shall here derive a few integrals involving this function which will be found useful later.

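The first is the integral over the product of two Hermite functions,

$$\int_{-\infty}^{\infty}e^{-x^2}H_n(x)H_m(x)\,dx$$

In view of (84)

$$e^{x^2-(z_1-x)^2}\cdot e^{x^2-(z_2-x)^2} = \sum_{\lambda}\sum_{\mu}\frac{H_{\lambda}(x)H_{\mu}(x)}{\lambda!\,\mu!}z_1^{\lambda}z_2^{\mu}$$

Hence, multiplying each side by $e^{-x^2}$ and integrating,

$$\sum_{\lambda\mu}\frac{z_1^{\lambda}z_2^{\mu}}{\lambda!\,\mu!}\left[\int_{-\infty}^{\infty}e^{-x^2}H_{\lambda}(x)H_{\mu}(x)\,dx\right] = \int_{-\infty}^{\infty}e^{x^2-(z_1-x)^2-(z_2-x)^2}\,dx \tag{3-90}$$

The integral on the right has the value 15 $\sqrt{\pi}\,e^{2z_1z_2}$. This may be expanded to read

$$\sqrt{\pi}\,e^{2z_1z_2} = \sqrt{\pi}\sum_{\lambda}\frac{(2z_1z_2)^{\lambda}}{\lambda!} = \sqrt{\pi}\sum_{\lambda\mu}\frac{2^{\lambda}z_1^{\lambda}z_2^{\mu}}{\lambda!}\,\delta_{\lambda\mu} \tag{3-91}$$

where the single summation over $\lambda$ has been changed into a double summation over $\lambda$ and $\mu$ by the artificial use of the Kronecker $\delta$-symbol, defined on page 104. Since (90) is true for every value of $z_1$ and $z_2$, the individual coefficients of every power of $z_1$ and $z_2$ in both expansions must be equal. On comparing (91) with the left side of (90) we see that

$$\frac{1}{\lambda!\,\mu!}\int_{-\infty}^{\infty}e^{-x^2}H_{\lambda}H_{\mu}\,dx = \sqrt{\pi}\,\frac{2^{\lambda}}{\lambda!}\,\delta_{\lambda\mu}$$

or

$$\int_{-\infty}^{\infty}e^{-x^2}H_n(x)H_m(x)\,dx = 2^n\,n!\,\sqrt{\pi}\,\delta_{nm} \tag{3-92}$$

15 In evaluating it, use is made of the formula:

$$\int_{-\infty}^{\infty}e^{-ax^2-2bx}\,dx = \sqrt{\frac{\pi}{a}}\,e^{b^2/a}$$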
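The orthogonality relation (92), and the companion integral with an extra factor x, can be checked by numerical quadrature. The sketch below evaluates $H_n$ from the explicit series (81) and uses a plain trapezoidal rule on a finite interval; the interval half-width and step count are arbitrary choices made for this illustration, and the function names are hypothetical.

```python
import math

def hermite(n, x):
    """H_n(x) evaluated from the explicit series (3-81)."""
    return sum((-1) ** j * math.factorial(n) / (math.factorial(j) * math.factorial(n - 2 * j))
               * (2 * x) ** (n - 2 * j) for j in range(n // 2 + 1))

def weighted_overlap(n, m, power=0, half_width=10.0, steps=20000):
    """Trapezoidal estimate of the integral of x^power * exp(-x^2) * H_n * H_m over the real line."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0  # end-point weights of the trapezoidal rule
        total += w * x ** power * math.exp(-x * x) * hermite(n, x) * hermite(m, x)
    return total * h

if __name__ == "__main__":
    print(weighted_overlap(3, 3), 2 ** 3 * math.factorial(3) * math.sqrt(math.pi))
    print(weighted_overlap(2, 4))
```

Calling `weighted_overlap(n, n + 1, power=1)` gives a value that can be compared with $\sqrt{\pi}\,2^n(n+1)!$, the non-vanishing case of the integral with the factor x.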


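The integral

$$\int_{-\infty}^{\infty}xe^{-x^2}H_n(x)H_m(x)\,dx$$

can be evaluated in a similar way. In place of (90) we now write

$$\sum_{\lambda\mu}\frac{z_1^{\lambda}z_2^{\mu}}{\lambda!\,\mu!}\int_{-\infty}^{\infty}xe^{-x^2}H_{\lambda}(x)H_{\mu}(x)\,dx = \int_{-\infty}^{\infty}xe^{x^2-(z_1-x)^2-(z_2-x)^2}\,dx = \sqrt{\pi}\,(z_1+z_2)\,e^{2z_1z_2}$$

The last result is, on expansion,

$$\sqrt{\pi}\left(\sum_{\lambda}\frac{2^{\lambda}z_1^{\lambda+1}z_2^{\lambda}}{\lambda!} + \sum_{\lambda}\frac{2^{\lambda}z_1^{\lambda}z_2^{\lambda+1}}{\lambda!}\right) = \sqrt{\pi}\sum_{\lambda\mu}z_1^{\lambda}z_2^{\mu}\left(\frac{2^{\mu}}{\mu!}\,\delta_{\lambda,\mu+1} + \frac{2^{\lambda}}{\lambda!}\,\delta_{\mu,\lambda+1}\right)$$

Equating coefficients of $z_1^nz_2^m$ then yields

$$\int_{-\infty}^{\infty}xe^{-x^2}H_n(x)H_m(x)\,dx = \sqrt{\pi}\left(2^{n-1}\,n!\,\delta_{m,n-1} + 2^n(n+1)!\,\delta_{m,n+1}\right) \tag{3-93}$$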

The integral vanishes when n = m and also when n and m differ by more than unity. The same method may be used to calculate other integrals of the type

$$\int_{-\infty}^{\infty}x^re^{-x^2}H_n(x)H_m(x)\,dx$$

Later, however, we shall learn of simpler ways, involving matrix algebra, for deriving these from the result established in eq. (93). (See problem at the end of sec. 11.17.)

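Example. A simple harmonic oscillator, if treated by the methods of quantum mechanics, has a distribution of mass about the attracting center which is given by

$$P(\xi) = ce^{-\xi^2}[H_n(\xi)]^2$$

where $\xi = \sqrt{\beta}\,x$, $\beta$ being a quantity characteristic of the oscillator, and n is an integer (the quantum number of the state); m denotes the mass of the vibrating point. The moment of inertia of this mass distribution is given by

$$m\overline{x^2} = m\int_{-\infty}^{\infty}x^2e^{-\xi^2}[H_n(\xi)]^2\,dx\Big/\int_{-\infty}^{\infty}e^{-\xi^2}[H_n(\xi)]^2\,dx = \frac{m}{\beta}\int_{-\infty}^{\infty}\xi^2e^{-\xi^2}[H_n(\xi)]^2\,d\xi\Big/\int_{-\infty}^{\infty}e^{-\xi^2}[H_n(\xi)]^2\,d\xi$$

The integral in the denominator has already been calculated and is equal to $2^n\,n!\,\sqrt{\pi}$. The integral in the numerator may be computed by the same method and is found to be

$$\frac{2n+1}{2}\,2^n\,n!\,\sqrt{\pi}$$

Hence

$$m\overline{x^2} = \frac{m}{\beta}\,\frac{2n+1}{2}$$

Later (cf. sec. 11.11) it will be shown that $\beta = 4\pi^2m\nu_0/h$ so

$$m\overline{x^2} = \frac{2n+1}{2}\,\frac{h}{4\pi^2\nu_0}$$

where $\nu_0$ is the "classical" frequency of the oscillator.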


LIST OF HERMITE POLYNOMIALS

$H_0(\xi) = 1$

$H_1(\xi) = 2\xi$

$H_2(\xi) = 4\xi^2 - 2$

$H_3(\xi) = 8\xi^3 - 12\xi$

$H_4(\xi) = 16\xi^4 - 48\xi^2 + 12$

$H_5(\xi) = 32\xi^5 - 160\xi^3 + 120\xi$

$H_6(\xi) = 64\xi^6 - 480\xi^4 + 720\xi^2 - 120$

$H_7(\xi) = 128\xi^7 - 1344\xi^5 + 3360\xi^3 - 1680\xi$

$H_8(\xi) = 256\xi^8 - 3584\xi^6 + 13440\xi^4 - 13440\xi^2 + 1680$

$H_9(\xi) = 512\xi^9 - 9216\xi^7 + 48384\xi^5 - 80640\xi^3 + 30240\xi$

$H_{10}(\xi) = 1024\xi^{10} - 23040\xi^8 + 161280\xi^6 - 403200\xi^4 + 302400\xi^2 - 30240$
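The entries of this list follow mechanically from the series (81). As a sketch, the coefficient of $x^{n-2j}$ in (81) can be written $(-1)^j\,n!\,2^{n-2j}/[j!\,(n-2j)!]$, which the code below evaluates in exact integer arithmetic; it also verifies the recurrence (85) on the coefficients. The function names are hypothetical.

```python
import math

def hermite_coeffs(n):
    """Integer coefficients of H_n from the series (3-81); list index = power of x."""
    coeffs = [0] * (n + 1)
    for j in range(n // 2 + 1):
        coeffs[n - 2 * j] = ((-1) ** j * math.factorial(n) * 2 ** (n - 2 * j)
                             // (math.factorial(j) * math.factorial(n - 2 * j)))
    return coeffs

def derivative(coeffs):
    """Coefficients of the derivative of a polynomial given by its coefficient list."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

if __name__ == "__main__":
    for n in range(11):
        print(n, hermite_coeffs(n))
    # Recurrence (3-85): H_n' = 2 n H_{n-1}
    print(derivative(hermite_coeffs(6)) == [12 * a for a in hermite_coeffs(5)])
```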

3.11. Laguerre Polynomials and Functions. The theory of Laguerre polynomials may be developed along lines very similar to those of the preceding section. A Laguerre polynomial $L_n(x)$ has been defined in sec. 2.16 as the polynomial solution of Laguerre's differential eq. (2-67):

$$xy'' + (1-x)y' + ny = 0 \tag{3-94}$$

It exists whenever n is an integer and was found to be

$$L_n(x) = (-1)^n\left(x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n\,n!\right) \tag{3-95}$$

We first establish a representation of $L_n$ in the form of a definite integral. Consider

$$y_n = \frac{1}{2\pi i}\oint\frac{z^{-n-1}}{1-z}\exp\left(\frac{-xz}{1-z}\right)dz \tag{3-96}$$


where the contour is taken to include the origin. Differentiations with respect to x may be performed under the integral sign; hence

$$y_n' = -\frac{1}{2\pi i}\oint\frac{z^{-n}}{(1-z)^2}\exp\left(\frac{-xz}{1-z}\right)dz$$

$$y_n'' = \frac{1}{2\pi i}\oint\frac{z^{-n+1}}{(1-z)^3}\exp\left(\frac{-xz}{1-z}\right)dz$$

On substituting in the left-hand side of (94) we find

$$\frac{1}{2\pi i}\oint\left[\frac{xz^2}{(1-z)^3} - \frac{(1-x)z}{(1-z)^2} + \frac{n}{1-z}\right]z^{-n-1}\exp\left(\frac{-xz}{1-z}\right)dz$$

But this is easily seen to be

$$-\frac{1}{2\pi i}\oint\frac{d}{dz}\left[\frac{z^{-n}}{1-z}\exp\left(\frac{-xz}{1-z}\right)\right]dz$$

an expression which vanishes because the quantity in brackets takes on the same value at the initial and final point of the contour. Hence (96) is a solution of Laguerre's differential equation. Moreover, it is a polynomial, as an analysis in the light of the theorem of residues will show. Its relation to $L_n(x)$ may be established by computing both $y_n(x)$ and $L_n(x)$ for a particular value of x, say zero. From (95)

$$L_n(0) = n!$$

from (96)

$$y_n(0) = \frac{1}{2\pi i}\oint z^{-n-1}(1 + z + z^2 + \cdots)\,dz = 1$$

Therefore

$$L_n(x) = \frac{n!}{2\pi i}\oint\frac{z^{-n-1}}{1-z}\exp\left(\frac{-xz}{1-z}\right)dz \tag{3-97}$$

Again, using the theorem of residues (eq. 3), we find

$$(1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right) = \sum_ny_nz^n = \sum_n\frac{L_n(x)}{n!}z^n \tag{3-98}$$

Next, we turn our attention to the recurrence relations existing between Laguerre polynomials of different degrees, and between the derivatives of a given polynomial. A relation of the latter type follows at once from the differential equation:

$$xL_n'' + (1-x)L_n' + nL_n = 0 \tag{3-99}$$

The former relation may be obtained by differentiating (98) with respect to z:

$$\frac{1-x-z}{(1-z)^3}\exp\left(\frac{-xz}{1-z}\right) = \sum_{\lambda}\frac{L_{\lambda}(x)z^{\lambda-1}}{(\lambda-1)!}$$

When the left-hand side of this equation is again expressed in terms of Laguerre polynomials with the use of (98), the result may be written

$$(1-x-z)\sum_{\lambda}\frac{L_{\lambda}(x)z^{\lambda}}{\lambda!} = (1-2z+z^2)\sum_{\lambda}\frac{L_{\lambda}(x)z^{\lambda-1}}{(\lambda-1)!}$$

On equating the coefficients of $z^n$, there results

$$\frac{(1-x)L_n}{n!} - \frac{L_{n-1}}{(n-1)!} = \frac{L_{n+1}}{n!} - \frac{2L_n}{(n-1)!} + \frac{L_{n-1}}{(n-2)!}$$

whence,

$$(1+2n-x)L_n - n^2L_{n-1} - L_{n+1} = 0 \tag{3-100}$$

which is the relation here sought.
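Relation (100) can be used to generate the polynomials step by step. The sketch below does this with exact integer coefficients and compares the result with an explicit closed form, $L_n(x) = \sum_m (-1)^m \binom{n}{m}\frac{n!}{m!}x^m$, which is (95) written term by term under the normalization $L_n(0) = n!$. The function names are hypothetical.

```python
import math

def laguerre_explicit(n):
    """Coefficients of L_n (list index = power of x), normalized so that L_n(0) = n!."""
    return [(-1) ** m * math.comb(n, m) * math.factorial(n) // math.factorial(m)
            for m in range(n + 1)]

def laguerre_by_recurrence(n_max):
    """Build L_0 .. L_n_max from L_{n+1} = (1 + 2n - x) L_n - n^2 L_{n-1}, i.e. (3-100)."""
    polys = [[1], [1, -1]]
    for n in range(1, n_max):
        prev, cur = polys[n - 1], polys[n]
        nxt = [0] * (n + 2)
        for k, a in enumerate(cur):
            nxt[k] += (1 + 2 * n) * a   # (1 + 2n) * L_n
            nxt[k + 1] -= a             # -x * L_n
        for k, a in enumerate(prev):
            nxt[k] -= n * n * a         # -n^2 * L_{n-1}
        polys.append(nxt)
    return polys[: n_max + 1]

if __name__ == "__main__":
    for n, poly in enumerate(laguerre_by_recurrence(5)):
        print(n, poly, poly == laguerre_explicit(n))
```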

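For some purposes it is convenient to have $L_n$ in the form of a derivative. To find it we differentiate (98) n times with respect to z and afterwards put z = 0, thus obtaining

$$\lim_{z\to 0}\frac{d^n}{dz^n}\left[(1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right)\right] = L_n(x)$$

The reader will be able to show without difficulty that

$$\lim_{z\to 0}\frac{d^n}{dz^n}\left[(1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right)\right] = e^x\frac{d^n}{dx^n}(x^ne^{-x})$$

Hence

$$L_n(x) = e^x\frac{d^n}{dx^n}(x^ne^{-x}) \tag{3-101}$$

The associated Laguerre polynomial, of degree n - k, was shown in sec. 2.16 to be a solution of the associated Laguerre equation, and is given by

$$y = L_n^k = \frac{d^k}{dx^k}L_n(x) \tag{3-102}$$

On differentiating (98) k times with respect to x, it is seen at once that

$$(-1)^k(1-z)^{-1}\left(\frac{z}{1-z}\right)^k\exp\left(\frac{-xz}{1-z}\right) = \sum_{\lambda=k}^{\infty}\frac{L_{\lambda}^k(x)}{\lambda!}z^{\lambda} \tag{3-103}$$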


A function of great importance in quantum mechanics is the associated 
Laguerre function , for it describes, in a sense to be discussed fully in Chap- 
ter 11, the motion of the electron in the hydrogen atom. It satisfies differ- 
ential eq. (71) in sec. 2.16, and was there shown to be represented by 


$$y_{n,k} = e^{-x/2}x^{(k-1)/2}L_n^k(x) \tag{3-104}$$

Certain integrals involving this function are often used and will here be calculated. They are of the form

$$I_{n,m} = \int_0^{\infty}e^{-x}x^{k-1}L_n^k(x)L_m^k(x)\cdot x^p\,dx$$

where p is another integer which we shall take in this work to be either 1, 2, or 3. Furthermore, our interest will be confined to $I_{n,n}$. If we multiply eq. (103) in which $z_1$ is written for z, by a similar one in which z is replaced by $z_2$, there results

$$\sum_{\lambda,\mu=k}^{\infty}\frac{z_1^{\lambda}z_2^{\mu}}{\lambda!\,\mu!}L_{\lambda}^k(x)L_{\mu}^k(x) = (z_1z_2)^k[(1-z_1)(1-z_2)]^{-k-1}\exp\left[-\frac{xz_1}{1-z_1} - \frac{xz_2}{1-z_2}\right]$$

Let us now multiply each side of this equation by $e^{-x}x^{k+p-1}$ and then integrate with respect to x. In view of the definition of $I_{n,m}$, the result may be written

$$\sum_{\lambda,\mu=k}^{\infty}\frac{z_1^{\lambda}z_2^{\mu}}{\lambda!\,\mu!}I_{\lambda,\mu} = (z_1z_2)^k[(1-z_1)(1-z_2)]^{-k-1}\int_0^{\infty}x^{k+p-1}\exp\left[-x\left(1 + \frac{z_1}{1-z_1} + \frac{z_2}{1-z_2}\right)\right]dx$$

Now

$$\int_0^{\infty}e^{-ax}x^r\,dx = a^{-r-1}\,r!$$

as may be shown by r-fold partial integration or from eq. (10). If we put

$$a = 1 + \frac{z_1}{1-z_1} + \frac{z_2}{1-z_2} = \frac{1-z_1z_2}{(1-z_1)(1-z_2)}, \qquad r = k+p-1$$

we obtain, therefore,

$$\sum_{\lambda,\mu}\frac{z_1^{\lambda}z_2^{\mu}}{\lambda!\,\mu!}I_{\lambda,\mu} = \frac{(z_1z_2)^k(1-z_1)^{p-1}(1-z_2)^{p-1}}{(1-z_1z_2)^{k+p}}\,(k+p-1)! \tag{3-105}$$

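When the denominator on the right is expanded by the binomial theorem,

$$(1-z_1z_2)^{-k-p} = \sum_{\lambda}\binom{k+p+\lambda-1}{\lambda}(z_1z_2)^{\lambda} = \sum_{\lambda}\frac{(k+p+\lambda-1)!}{(k+p-1)!\,\lambda!}(z_1z_2)^{\lambda}$$

the right-hand side of (105) becomes

$$(1-z_1)^{p-1}(1-z_2)^{p-1}\sum_{\lambda}\frac{(k+p+\lambda-1)!}{\lambda!}(z_1z_2)^{k+\lambda} \tag{3-106}$$

Thus, in view of (105), $I_{n,n}$ is simply $(n!)^2$ times the coefficient of $(z_1z_2)^n$ of this expression.

a. When p = 1, this is obtained by choosing that term of the summation in (106) for which $k + \lambda = n$, that is $\lambda = n - k$.

$$I_{n,n} = \frac{(n!)^3}{(n-k)!} \tag{3-107a}$$

b. When p = 2, (106) becomes

$$\sum_{\lambda}\frac{(k+\lambda+1)!}{\lambda!}\left[(z_1z_2)^{k+\lambda} - z_1^{k+\lambda+1}z_2^{k+\lambda} - z_1^{k+\lambda}z_2^{k+\lambda+1} + (z_1z_2)^{k+\lambda+1}\right]$$

The second and third terms in the bracket, in which $z_1$ and $z_2$ appear with different exponents, cannot contribute to $I_{n,n}$; the first term contributes when $\lambda = n - k$, the last when $\lambda = n - k - 1$. Hence

$$I_{n,n} = (n!)^2\left[\frac{(n+1)!}{(n-k)!} + \frac{n!}{(n-k-1)!}\right] = \frac{(n!)^3}{(n-k)!}(2n-k+1) \tag{3-107b}$$

c. When p = 3, the significant parts of (106) are

$$\sum_{\lambda}\frac{(k+\lambda+2)!}{\lambda!}\left[(z_1z_2)^{k+\lambda} + 4(z_1z_2)^{k+\lambda+1} + (z_1z_2)^{k+\lambda+2}\right]$$

terms with different exponents of $z_1$ and $z_2$ having been omitted. Consequently,

$$I_{n,n} = (n!)^2\left[\frac{(n+2)!}{(n-k)!} + \frac{4(n+1)!}{(n-k-1)!} + \frac{n!}{(n-k-2)!}\right] = \frac{(n!)^3}{(n-k)!}\left(6n^2 - 6nk + k^2 + 6n - 3k + 2\right) \tag{3-107c}$$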

Obviously, the same process permits the evaluation of $I_{n,n}$ for any value of p. The quantities $I_{n,m}$ for $n \ne m$ are rarely needed, but can be obtained by this method also.
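The closed forms for $I_{n,n}$ with p = 1, 2, 3 can be confirmed by brute force in exact integer arithmetic, since every term of the integrand has the form $x^je^{-x}$ and $\int_0^{\infty}x^je^{-x}dx = j!$. The sketch below assumes the normalization $L_n(0) = n!$ and $L_n^k = d^kL_n/dx^k$ used in this section; the p = 3 factor is the bracket of the three-term sum simplified algebraically. The function names are hypothetical.

```python
import math

def laguerre(n):
    """Coefficients of L_n (list index = power of x), normalized so that L_n(0) = n!."""
    return [(-1) ** m * math.comb(n, m) * math.factorial(n) // math.factorial(m)
            for m in range(n + 1)]

def associated_laguerre(n, k):
    """Coefficients of L_n^k = d^k L_n / dx^k; requires k <= n."""
    c = laguerre(n)
    for _ in range(k):
        c = [j * c[j] for j in range(1, len(c))]
    return c

def integral_exact(n, k, p):
    """I_{n,n} by termwise integration of exp(-x) x^(k-1+p) [L_n^k(x)]^2 over (0, infinity)."""
    c = associated_laguerre(n, k)
    return sum(a * b * math.factorial(i + j + k - 1 + p)
               for i, a in enumerate(c) for j, b in enumerate(c))

def integral_closed(n, k, p):
    """Closed forms for p = 1, 2, 3."""
    factor = {1: 1,
              2: 2 * n - k + 1,
              3: 6 * n * n - 6 * n * k + k * k + 6 * n - 3 * k + 2}[p]
    return math.factorial(n) ** 3 * factor // math.factorial(n - k)

if __name__ == "__main__":
    for n, k in [(2, 1), (4, 2), (5, 3)]:
        print(n, k, [(integral_exact(n, k, p), integral_closed(n, k, p)) for p in (1, 2, 3)])
```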

Example. The electronic charge of the hydrogen atom is distributed about the proton as origin in accordance with the distribution function

$$P(\rho) = c\rho^{2l+2}e^{-\rho}\left[L_{n+l}^{2l+1}(\rho)\right]^2$$

as will be shown in sec. 11.13. In this expression n and l stand for the "total" and "angular momentum" quantum numbers which designate the state of the atom; c is a constant which is different for different states, and $\rho$ is proportional to r, the radius vector: $\rho = (2/na_0)r$. The proportionality constant depends on the quantum number n and differs for different states of the atom; $a_0$ is the fundamental constant known as the first "Bohr radius." $P(\rho)$, finally, represents the charge to be found within the spherical shell enclosed between $\rho$ and $\rho + d\rho$.

Let it be desired to find the mean value of 1/r and r for this distribution. Clearly

$$\overline{r^{-1}} = \frac{\int_0^{\infty}P(\rho)\frac{1}{r}\,dr}{\int_0^{\infty}P(\rho)\,dr} = \frac{2}{na_0}\,\frac{\int_0^{\infty}\frac{P(\rho)}{\rho}\,d\rho}{\int_0^{\infty}P(\rho)\,d\rho}$$

The integral in the numerator is simply $I_{n+l,n+l}$ with k = 2l + 1 and p = 1, that in the denominator is also $I_{n+l,n+l}$ with the same k, but with p = 2. Hence, using (107a and b)

$$\overline{r^{-1}} = \frac{2}{na_0}\cdot\frac{1}{2n} = \frac{1}{n^2a_0}$$

Similarly,

$$\bar r = \frac{\int_0^{\infty}P(\rho)\,r\,dr}{\int_0^{\infty}P(\rho)\,dr} = \frac{na_0}{2}\,\frac{\int_0^{\infty}P(\rho)\,\rho\,d\rho}{\int_0^{\infty}P(\rho)\,d\rho} = \frac{na_0}{2}\,\frac{I_{n+l,n+l}(p=3)}{I_{n+l,n+l}(p=2)}, \qquad \text{with } k = 2l+1$$

In view of (107c and b)

$$\bar r = \frac{na_0}{2}\cdot\frac{6n^2 - 2l(l+1)}{2n} = \frac{a_0}{2}\left[3n^2 - l(l+1)\right]$$


3.12. Generating Functions. A simple and powerful way of studying functions of the more unfamiliar types is by means of generating functions, that is, functions of two arguments which, when expanded in a power series with respect to one argument, contain the functions to be studied as coefficients involving the other argument parametrically. Examples of generating functions have occurred in the preceding sections; they will be exhibited once more for easy reference.

1. Legendre Polynomials.

$$(1 - 2xy + y^2)^{-1/2} = \sum_{l=0}^{\infty}P_l(x)y^l \qquad \text{(Cf. eq. 24)}$$

2. Associated Legendre Polynomials.

$$\frac{(2m)!\,(1-x^2)^{m/2}\,y^m}{2^m\,m!\,(1 - 2xy + y^2)^{m+1/2}} = \sum_{l=m}^{\infty}P_l^m(x)y^l$$

This was not used in the text, but is easily derived by differentiation from (24) on the basis of (43).

3. Bessel Functions (of integral order).

$$\exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right] = \sum_{n=-\infty}^{\infty}J_n(x)u^n \qquad \text{(Cf. 64)}$$

4. Hermite Polynomials.

$$\exp\left[x^2 - (z-x)^2\right] = \sum_{n=0}^{\infty}\frac{H_n(x)}{n!}z^n \qquad \text{(Cf. 84)}$$

5. Laguerre Polynomials.

$$(1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=0}^{\infty}\frac{L_n(x)}{n!}z^n \qquad \text{(Cf. 98)}$$

6. Associated Laguerre Polynomials.

$$(-1)^k(1-z)^{-1}\left(\frac{z}{1-z}\right)^k\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=k}^{\infty}\frac{L_n^k(x)}{n!}z^n \qquad \text{(Cf. 103)}$$

7. Tschebyscheff Polynomials.

$$\frac{1 - xy}{1 - 2xy + y^2} = \sum_{n=0}^{\infty}T_n(x)y^n$$


fNVvf. nmvprl in but. pn 9— K4 ^ 


133 


LINEAR DEPENDENCE 


3.13 


exists such that

$$\sum_{\lambda=1}^{n}k_{\lambda}\varphi_{\lambda} = 0$$

If this relation can be satisfied only by putting all $k_{\lambda}$ equal to zero, the functions are linearly independent.

A criterion for linear dependence is easily derived. We observe that the integral

$$I(k_1\cdots k_n) = \int\Big|\sum_{\lambda}k_{\lambda}\varphi_{\lambda}(x)\Big|^2dx \tag{3-108}$$

taken over the range of x in which the functions $\varphi_{\lambda}$ are considered, cannot be smaller than zero. It will attain the minimum, zero, for specific values of the parameters $k_{\lambda}$. Now it will first be shown that, if I has a stationary value at all, this value must be zero.

For this purpose, let us vary I, replacing every $k_{\lambda}$ by $k_{\lambda}(1 + \delta\kappa)$. The result is

$$I + \delta I = (1 + \delta\kappa)^2I$$

and

$$\delta I = [2\delta\kappa + (\delta\kappa)^2]I$$

Where I has a stationary value, $\delta I$ must vanish; but it is seen that $\delta I$ cannot vanish unless I itself is zero. Therefore the stationary value of (108) is zero, and we may say that the conditions

$$\frac{\partial I}{\partial k_{\lambda}} = 0, \qquad \lambda = 1, 2, \cdots n \tag{3-109}$$

are both sufficient and necessary for the vanishing of I or, what amounts to the same thing, for the validity of

$$\sum_{\lambda}k_{\lambda}\varphi_{\lambda} = 0$$

If, therefore, eqs. (109) have a solution other than the trivial one in which all $k_{\lambda}$ are zero, the functions $\varphi_{\lambda}$ are linearly dependent.

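But (108) may be written in a different way. If we define the coefficients

$$a_{\lambda\mu} = \int\varphi_{\lambda}^*\varphi_{\mu}\,dx$$

we have

$$I = \sum_{\lambda\mu}a_{\lambda\mu}k_{\lambda}^*k_{\mu}$$

and eqs. (109) now read

$$\sum_{\mu}a_{\lambda\mu}k_{\mu} = 0, \qquad \sum_{\lambda}a_{\lambda\mu}k_{\lambda}^* = 0 \tag{3-110}$$

These are identical because $a_{\mu\lambda} = a_{\lambda\mu}^*$, so that one is merely the complex conjugate form of the other. Now the condition that (110) shall have a non-vanishing solution $k_1, k_2, \cdots k_n$ is that the determinant

$$|a_{\lambda\mu}| = 0$$

This, therefore, is the condition for linear dependence of the functions $\varphi_{\lambda}$. Conversely, if $|a_{\lambda\mu}| \ne 0$, the set of functions is linearly independent. The determinant $|a_{\lambda\mu}|$ is named after the mathematician Gram.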
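The Gram criterion is easy to try on a small example. The sketch below is an illustration only: it restricts itself to real polynomials on the interval [0, 1] (so that $a_{\lambda\mu} = \int_0^1\varphi_{\lambda}\varphi_{\mu}\,dx$ can be evaluated exactly with rational numbers) and computes the determinant by Laplace expansion. The interval, the test functions and the function names are assumptions made for this example.

```python
from fractions import Fraction

def inner(p, q):
    """Integral over [0, 1] of the product of two real polynomials (integer coefficient lists)."""
    return sum(Fraction(a * b, i + j + 1) for i, a in enumerate(p) for j, b in enumerate(q))

def determinant(m):
    """Laplace expansion along the first row; adequate for small matrices."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * determinant([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def gram_determinant(functions):
    """Determinant |a_lambda_mu| built from the pairwise integrals of the given functions."""
    return determinant([[inner(p, q) for q in functions] for p in functions])

if __name__ == "__main__":
    independent = [[1], [0, 1], [0, 0, 1]]   # 1, x, x^2
    dependent = [[1], [0, 1], [3, 2]]        # 1, x, 3 + 2x
    print(gram_determinant(independent))     # non-zero: linearly independent
    print(gram_determinant(dependent))       # zero: linearly dependent
```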

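A simpler test, applicable when the functions $\varphi_1, \cdots \varphi_n$ are differentiable n - 1 times within their range of definition, may be conducted as follows. If the functions are linearly dependent,

$$\sum_{\lambda=1}^{n}k_{\lambda}\varphi_{\lambda} = 0, \qquad \sum_{\lambda=1}^{n}k_{\lambda}\varphi_{\lambda}' = 0, \qquad \cdots \qquad \sum_{\lambda=1}^{n}k_{\lambda}\varphi_{\lambda}^{(n-1)} = 0$$

These n homogeneous equations may be regarded as determining the set of constants $k_{\lambda}$. It will be shown in section 10.9 that they possess solutions other than $k_1 = k_2 = k_3 \cdots k_n = 0$ only if the determinant of the coefficients of $k_{\lambda}$, called the Wronskian,

$$\begin{vmatrix} \varphi_1 & \varphi_2 & \cdots & \varphi_n \\ \varphi_1' & \varphi_2' & \cdots & \varphi_n' \\ \vdots & \vdots & & \vdots \\ \varphi_1^{(n-1)} & \varphi_2^{(n-1)} & \cdots & \varphi_n^{(n-1)} \end{vmatrix}$$

vanishes. For independence of the solutions, then, the Wronskian must not vanish. It should be stated, however, that the vanishing of this determinant is not a sufficient condition for linear dependence of the functions.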

3.14. Schwarz' Inequality. Let f and g be any two functions of x such that the integrals

$$A = \int f^*f\,dx, \qquad B = \int f^*g\,dx, \qquad C = \int g^*g\,dx \tag{3-111}$$

exist. The integrations extend over any definite range of the variable x. Certainly the integral

$$\int[\lambda f^*(x) + g^*(x)][\lambda f(x) + g(x)]\,dx = A\lambda^2 + (B^* + B)\lambda + C$$

is positive for every real $\lambda$ unless $\lambda f + g$ vanishes identically; hence the quadratic expression on the right, in general, has no real roots in $\lambda$. But the roots of $A\lambda^2 + (B^* + B)\lambda + C$ are given by

$$\lambda = -\frac{B^* + B}{2A} \pm \frac{1}{2A}\sqrt{(B^* + B)^2 - 4AC}$$

They are real unless

$$4AC \ge (B^* + B)^2 \tag{3-112}$$

The equality sign here holds only when g = const. × f.

The quantity $B^* + B$ on the right-hand side of (112) is twice the real part of B. Hence, if f and g are real functions, the inequality becomes

$$\int f^2\,dx\cdot\int g^2\,dx \ge \left(\int fg\,dx\right)^2 \tag{3-113}$$

which is one form of Schwarz' inequality.

For complex functions f and g, (112) may be modified. Write f and g in polar form:

$$f(x) = \rho_1(x)e^{i\theta_1(x)}; \qquad g(x) = \rho_2(x)e^{i\theta_2(x)}$$

Then $B = \int\rho_1\rho_2e^{i(\theta_2-\theta_1)}\,dx$. Since (112) holds for every pair of functions f and g (which have integrable squares), it must also be true when g is replaced by $\bar g = ge^{i(\theta_1-\theta_2)}$. But the substitution of $\bar g$ for g leaves the values of A and C unchanged while it converts both $B^*$ and B into $\int\rho_1\rho_2\,dx$, which is not smaller than $|B|$, the modulus of B. Hence

$$\int f^*f\,dx\int g^*g\,dx \ge \Big|\int f^*g\,dx\Big|^2 \tag{3-114}$$

This is the more general form of the Schwarz inequality. Further generalization to functions of more than one real variable is obvious.

A relation like (114) is also valid for sums:

$$\Big(\sum_if_i^*f_i\Big)\Big(\sum_ig_i^*g_i\Big) \ge \Big|\sum_if_i^*g_i\Big|^2 \tag{3-115}$$

For ordinary vectors U and V this is equivalent to

$$U^2V^2 \ge (\mathbf{U}\cdot\mathbf{V})^2$$

Problem. Prove inequality (115).
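The inequality for sums can be illustrated with a few complex numbers. This is a numerical illustration, not a proof; the sample lists are arbitrary and the function name is hypothetical.

```python
def schwarz_sides(f, g):
    """Return (sum|f_i|^2 * sum|g_i|^2, |sum conj(f_i) g_i|^2) for two equally long lists."""
    a = sum(abs(x) ** 2 for x in f)
    c = sum(abs(y) ** 2 for y in g)
    b = sum(x.conjugate() * y for x, y in zip(f, g))
    return a * c, abs(b) ** 2

if __name__ == "__main__":
    f = [1 + 2j, -0.5j, 3.0]
    g = [2 - 1j, 4, 1j]
    print(schwarz_sides(f, g))                               # left side exceeds right side
    print(schwarz_sides(f, [(2 - 3j) * x for x in f]))       # equality when g = const * f
```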




CHAPTER 4 


VECTOR ANALYSIS 

4.1. Definition of a Vector. — A physical quantity possessing both 
magnitude and direction is called a vector; typical examples are velocity, 
acceleration, force and angular momentum; other quantities such as mass, 
volume, temperature and time, having magnitude only, are called scalars. 
It is customary to represent vectors by letters in bold-face type and scalars 
in italics, so that A stands for a vector whose magnitude is A . This custom 
will here be followed. A vector may be indicated graphically by an arrow 
drawn between two points, tail and head of the arrow being its origin and 
terminus, respectively; the scalar part of the vector equals (or is propor- 
tional to) the length of the arrow and the direction of arrow and vector 
coincide. 

It is often necessary to locate a vector relative to a coordinate system, which may be done by giving the coordinates of origin and terminus. Let the selected coordinate system be the usual right-handed 1 Cartesian one with three mutually perpendicular axes X, Y and Z so oriented that if the positive X-axis points towards the reader's right and the positive Y-axis towards the top of the page, the positive Z-axis will point up from the page towards the reader. Let the coordinates of origin and terminus of A be $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, respectively; then the three rectangular Cartesian components of A, relative to the axes X, Y, Z, are defined to be

$$A_x = x_2 - x_1; \qquad A_y = y_2 - y_1; \qquad A_z = z_2 - z_1$$

The length of the vector is the distance between the two points:

$$A = \sqrt{A_x^2 + A_y^2 + A_z^2}$$

The two points might be located relative to many other coordinate systems, one of which could be obtained from the previous one by rotation of the axes to X', Y', Z' and translation of the origin from O to O' as shown in Fig. 1. Suppose the coordinates of O' in the first system are $(x_0, y_0, z_0)$, then the position of the second system is determined with respect to the first when the angles between O'X', O'Y', O'Z' and OX, OY, OZ are known. The cosines of these nine angles are given in Table 1, where for the present purpose the first row and column are to be used; for example, $a_{13}$ is the cosine of the angle between the two straight lines O'X' and OZ.

TABLE 1

                  OX         OY         OZ
                  $A_x$      $A_y$      $A_z$
O'X'   $A_x'$     $a_{11}$   $a_{12}$   $a_{13}$
O'Y'   $A_y'$     $a_{21}$   $a_{22}$   $a_{23}$
O'Z'   $A_z'$     $a_{31}$   $a_{32}$   $a_{33}$


In order to locate the vector in the system O'X'Y'Z', it is necessary to obtain relations between the nine direction cosines. From a well-known formula of solid analytic geometry, if $\theta$ is the angle between two straight lines, whose angles with the coordinate axes are $\alpha_1, \beta_1, \gamma_1, \alpha_2, \beta_2, \gamma_2$,

$$\cos\theta = \cos\alpha_1\cos\alpha_2 + \cos\beta_1\cos\beta_2 + \cos\gamma_1\cos\gamma_2 \tag{4-1}$$

If the lines are perpendicular to each other, as is true for the axes OY and OZ, $\cos\theta = 0$, hence

$$a_{12}a_{13} + a_{22}a_{23} + a_{32}a_{33} = 0 \tag{4-2}$$

with five similar relations for the other pairs of axes. Moreover, the sum of the squares of the direction cosines of any line is unity; for example, for the line OX relative to O'X'Y'Z'

$$a_{11}^2 + a_{21}^2 + a_{31}^2 = 1 \tag{4-3}$$

There are yet ten more relations, in addition to the twelve expressed in eqs. (2) and (3). Nine of them are of the type $a_{11} = a_{22}a_{33} - a_{32}a_{23}$ and the tenth is the determinant of the cosines, which equals unity. It is evident that the nine cosines are not linearly independent.

Now let (xi,yi,zi) and (x f ]} y[,z[) be the coordinates of the same point P 
in OXYZ and O' X'Y'Z' and let oq, (3 U y l7 a 2 be the direction angles of 
O'P with OX, OY, OZ, 0' X\ Then from {l)\nd Table 1, 

x[ = 0 f P cos (Xo — 0 P {a,\\ COS Oi\ <2i2 cos T - ^13 COS Yi ) 


— a u(xi ~~ x 0 ) + <212(2/1 — y 0) + — zq ) 

In like manner, 

2/1 = a 2 i(x! - x 0 ) + <*22(2/1 - 2/o) + <123(21 - z 0 ) 

Z 1 = <*31 (^1 ~ -To) + <*32(</l ~ 2/o) + <*33(^1 “ 2(j) 


Similarly, if the components of $\mathbf{A}$ in $O'X'Y'Z'$ are

$$A'_x = x'_2 - x'_1, \quad A'_y = y'_2 - y'_1, \quad A'_z = z'_2 - z'_1$$

then,

$$A'_x = a_{11}A_x + a_{12}A_y + a_{13}A_z \tag{4-4}$$

and two other expressions for $A'_y$ and $A'_z$ may be derived in the same way.
These three equations may be solved for the unprimed quantities in terms
of the primed ones, or the same method may be continued to give three
relations like

$$A_x = a_{11}A'_x + a_{21}A'_y + a_{31}A'_z \tag{4-5}$$


All of them are symbolized, in self-explanatory fashion, in Table 1 if the
second row and column are used. While it is usually true that the components
of a vector are different in different reference frames, certain properties
such as the length and the angle between two vectors are equal in all
frames. It is readily shown, using (2), (3) and (5), that

$$A = A' = \sqrt{A_x'^2 + A_y'^2 + A_z'^2}$$
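The relations among the direction cosines can be checked numerically. The following is only a sketch: the orientation of the primed frame (a rotation about Z followed by one about X), the angles, the sample components and all function names are assumptions made for illustration, not part of the text.

```python
import math

def about_z(angle):
    # Rows are the new axes X', Y', Z'; entries are their cosines with OX, OY, OZ.
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def about_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def to_primed(a, v):
    # Eq. (4-4): A'_i = sum_j a_ij A_j (read Table 1 along its rows).
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def to_unprimed(a, vp):
    # Eq. (4-5): A_j = sum_i a_ij A'_i (read Table 1 down its columns).
    return [sum(a[i][j] * vp[i] for i in range(3)) for j in range(3)]

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def length(v):
    return math.sqrt(sum(c * c for c in v))

a = matmul(about_x(0.7), about_z(0.3))   # hypothetical orientation of O'X'Y'Z'
A = [1.0, -2.0, 0.5]                      # hypothetical components in OXYZ
A_primed = to_primed(a, A)

# Eq. (4-2): the cosines of OY and OZ are orthogonal.
eq2 = a[0][1] * a[0][2] + a[1][1] * a[1][2] + a[2][1] * a[2][2]
# Eq. (4-3): the squared cosines of OX sum to unity.
eq3 = a[0][0] ** 2 + a[1][0] ** 2 + a[2][0] ** 2
# One of the nine relations of the type a11 = a22*a33 - a32*a23.
cofactor_gap = a[0][0] - (a[1][1] * a[2][2] - a[2][1] * a[1][2])

print(eq2, eq3, cofactor_gap, det3(a))
print(length(A), length(A_primed))
```

Under these assumptions `eq2` and `cofactor_gap` vanish and `eq3` and the determinant equal unity to rounding error, while the two printed lengths agree.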


Considerable simplification often results in expressing physical laws 
in vector notation, without reference to a selected coordinate system. The 
transformation properties just described, however, show that it is always 
possible to list the components of a vector in any given reference frame 
when so desired. In accordance with these ideas, a vector is sometimes
defined as a set of three numbers $(A_x, A_y, A_z)$ referred to a given frame;
if the numbers are then referred to a second frame, they will become
$(A'_x, A'_y, A'_z)$ with relations as given by Table 1. Provided these conditions
are met the vector is said to be a proper vector. This analytical definition
is more restrictive than the intuitive conception of a vector as a quantity
possessing magnitude and direction, but it leads naturally to the more
general idea of the tensor and it may be readily extended to the vector in
n-dimensional space as used in many branches of modern analysis. Moreover,
the analytical definition is more precise than the usual one, which
offers no explanation of the words "magnitude and direction." The three
components of a vector define these words provided they are the same in
all reference frames. Further comments on this matter will be found in
sec. 4.21.

4.2. Unit Vectors. — Vectors of unit length, drawn along the axes OX,
OY, and OZ, respectively, are called unit vectors (cf. Fig. 2); they are
designated by $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, respectively. Any directed line along any one of the
three axes is also a vector, for if its length is $A_x$ units along the X-axis, its
scalar magnitude is thereby given and its direction is specified by the unit
vector $\mathbf{i}$, the whole vector being designated by $A_x\mathbf{i}$. Similar vectors could
be drawn along the Y- or Z-axes, $A_y\mathbf{j}$ or $A_z\mathbf{k}$.

4.3. Addition and Subtraction of Vectors. — Referring again to Fig. 2, the vectors $A_x\mathbf{i}$
and $A_z\mathbf{k}$ may be replaced by more general symbols, A and B. The resultant
vector C representing the diagonal is called the vector sum of A and B, 

A + B = C 

The addition of vectors thus obeys the familiar rule for composition of 
forces in mechanics. To obtain the difference of two vectors, A — B, it is 
only necessary to define the negative of a vector. This is taken to mean 
a vector whose length is equal and whose direction is opposite to that of the 
original vector. Thus $\mathbf{A} - \mathbf{B} = \mathbf{A} + (-\mathbf{B})$. Hence the rule: To form
the difference of two vectors graphically, reverse the direction of the
subtrahend and complete the parallelogram as before.

From the parallelogram law, it is seen that any vector in a plane may be 
resolved in numerous ways into two components in the same plane, and 
that a vector in space may be resolved in numerous ways into three com- 
ponents, not in the same plane. If the resolution is made along the rectan- 
gular axes, the result may be symbolized in terms of unit vectors, 

$$\mathbf{C} = A_x\mathbf{i} + A_z\mathbf{k}$$

and

$$\mathbf{R} = A_x\mathbf{i} + A_y\mathbf{j} + A_z\mathbf{k} \tag{4-6}$$

From the geometry of Fig. 2, the lengths of C and R are

$$C = (A_x^2 + A_z^2)^{1/2}$$

$$R = (A_x^2 + A_y^2 + A_z^2)^{1/2} \tag{4-7}$$

The laws which govern addition and subtraction of vectors are easily 
seen to be associative, commutative, and distributive. Multiplication 
of a vector by a scalar is understood to mean multiplication of its length 
by the scalar factor, without change in its direction. Vector algebra thus 
developed enables one to demonstrate many geometrical theorems in a 
simple way. 2 

Problem a. Prove that the diagonals of a parallelogram bisect each other. 

Problem b. Prove that the line that joins one corner of a parallelogram to the 
middle point of an opposite side trisects the diagonal and is trisected by it. 

4.4. The Scalar Product of Two Vectors. — The scalar (or inner) product³
of two vectors is defined by

$$\mathbf{A}\cdot\mathbf{B} = AB\cos\theta \tag{4-8}$$

where $\theta$ is the angle between A and B. It follows that the scalar product of

2 Numerous examples may be found in books on vector analysis; see for example: 
Phillips, H. B., “ Vector Analysis,” John Wiley and Sons, New York, 1933; Gibbs- 


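two perpendicular unit vectors must vanish since $\theta = \pi/2$, $\cos\theta = 0$.
Similarly the scalar product of a unit vector by itself must equal unity,
since $\theta = 0$, $\cos\theta = 1$. In vector notation,

$$\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{i} = \mathbf{i}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{j} = 0 \tag{4-9}$$

$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = i^2 = j^2 = k^2 = 1 \tag{4-10}$$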

If $\mathbf{A} = \mathbf{B}$, $\theta = 0$, so from (8) and (10),

$$\mathbf{A}\cdot\mathbf{A} = A^2 = A_x^2 + A_y^2 + A_z^2$$

an equation which defines the square of the length of A (see also Fig. 2).
If

$$\mathbf{A}\cdot\mathbf{B} = 0 \tag{4-11}$$

for any two vectors, A and B are perpendicular to each other, unless one of them
vanishes; if

$$\mathbf{A}\cdot\mathbf{B} = AB$$

then A and B are parallel. In a Cartesian system,

$$\mathbf{A}\cdot\mathbf{B} = A_xB_x + A_yB_y + A_zB_z \tag{4-12}$$

The scalar product obeys the rules of ordinary multiplication

$$\mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A}$$

$$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = (\mathbf{A}\cdot\mathbf{B}) + (\mathbf{A}\cdot\mathbf{C})$$

From (8), it is seen that any relation involving the cosine of an included
angle may be written in terms of the scalar product. For example, the
mechanical work $W$ done by a force F which makes an angle $\theta$ with the
displacement D is $W = FD\cos\theta$ or in vector notation, $W = \mathbf{F}\cdot\mathbf{D}$.
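A short numerical illustration of (8) and the Cartesian form of the scalar product may be useful here. The force and displacement values are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import math

def dot(a, b):
    # Cartesian form of the scalar product: AxBx + AyBy + AzBz.
    return sum(x * y for x, y in zip(a, b))

def length(a):
    return math.sqrt(dot(a, a))

F = [3.0, 4.0, 0.0]   # hypothetical force
D = [2.0, 0.0, 0.0]   # hypothetical displacement

W = dot(F, D)                                  # work, W = F . D
cos_theta = W / (length(F) * length(D))        # from A . B = AB cos(theta)
theta = math.acos(cos_theta)

i, j = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
print(W, cos_theta, math.degrees(theta), dot(i, j), dot(i, i))
```

Here the work is 6 units and the cosine of the included angle is 0.6; the last two printed values reproduce $\mathbf{i}\cdot\mathbf{j} = 0$ and $\mathbf{i}\cdot\mathbf{i} = 1$.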

Problem. If A and B are the sides of a parallelogram and C, D are the diagonals,
show that $C^2 + D^2 = 2(A^2 + B^2)$; $C^2 - D^2 = 4AB\cos(\mathbf{A},\mathbf{B})$.

4.5. The Vector Product of Two Vectors. — Let two arbitrary vectors,
$\mathbf{A}$ and $\mathbf{B}$, be drawn from a common origin $O$ with an included angle $\theta$,
and let $\mathbf{C}$ be a vector perpendicular to both of them
(cf. Fig. 4). Then from (11) and (12)

$$\mathbf{C}\cdot\mathbf{A} = C_xA_x + C_yA_y + C_zA_z = 0$$

$$\mathbf{C}\cdot\mathbf{B} = C_xB_x + C_yB_y + C_zB_z = 0$$

Solving, we find

$$C_x = m(A_yB_z - A_zB_y)$$

$$C_y = m(A_zB_x - A_xB_z) \tag{4-13}$$

$$C_z = m(A_xB_y - A_yB_x)$$



where $m$ is an arbitrary constant, which is conveniently taken as +1.
Then from (6) and (13),

$$C^2 = C_x^2 + C_y^2 + C_z^2 = (A_x^2 + A_y^2 + A_z^2)(B_x^2 + B_y^2 + B_z^2) - (A_xB_x + A_yB_y + A_zB_z)^2$$

The first member on the right-hand side of this equals $A^2B^2$ by (6), the
second member equals $(\mathbf{A}\cdot\mathbf{B})^2 = (AB\cos\theta)^2$ by (12) and (8), hence

$$C^2 = (A^2B^2 - A^2B^2\cos^2\theta) = (AB\sin\theta)^2$$

The vector C may thus be described as the product of two other vectors,
A and B; it is called the vector (or skew) product⁴ and is written

$$\mathbf{C} = \mathbf{A}\times\mathbf{B}$$

Its length is $C = AB\sin\theta$; its direction is perpendicular to the plane
determined by A and B. Using (13) and the unit vectors, we may also write

$$\mathbf{C} = \mathbf{A}\times\mathbf{B} = (A_yB_z - A_zB_y)\mathbf{i} + (A_zB_x - A_xB_z)\mathbf{j} + (A_xB_y - A_yB_x)\mathbf{k}$$




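This may be put in the form of a determinant:

$$\mathbf{A}\times\mathbf{B} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix} \tag{4-14}$$

As a consequence of (14), vector products of the unit vectors become

$$\mathbf{i}\times\mathbf{j} = -\mathbf{j}\times\mathbf{i} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \mathbf{k}$$

$$\mathbf{j}\times\mathbf{k} = -\mathbf{k}\times\mathbf{j} = \mathbf{i}; \quad \mathbf{k}\times\mathbf{i} = -\mathbf{i}\times\mathbf{k} = \mathbf{j}$$

and

$$\mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = 0 \tag{4-15}$$

Eq. (14) shows that vector multiplication is not commutative:

$$\mathbf{A}\times\mathbf{B} = -\mathbf{B}\times\mathbf{A}$$

The distributive law of ordinary multiplication, however, is retained:

$$\mathbf{A}\times(\mathbf{B} + \mathbf{C}) = \mathbf{A}\times\mathbf{B} + \mathbf{A}\times\mathbf{C}$$

$$(\mathbf{A} + \mathbf{B})\times(\mathbf{C} + \mathbf{D}) = \mathbf{A}\times\mathbf{C} + \mathbf{A}\times\mathbf{D} + \mathbf{B}\times\mathbf{C} + \mathbf{B}\times\mathbf{D}$$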
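The component form of the vector product and the properties just listed can be verified with a few lines of code. This is a sketch with arbitrarily chosen sample vectors; the helper names are assumptions for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    # Components of A x B, i.e. eq. (4-13) with m = +1.
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

i, j, k = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
A = [1.0, 2.0, 3.0]     # hypothetical sample vectors
B = [-2.0, 0.5, 4.0]

C = cross(A, B)
anti = [c + d for c, d in zip(C, cross(B, A))]           # A x B + B x A
lagrange = dot(C, C) - (dot(A, A) * dot(B, B) - dot(A, B) ** 2)

print(cross(i, j), cross(j, k), cross(k, i))  # k, i, j
print(dot(C, A), dot(C, B))                   # both zero: C is perpendicular to A and B
print(anti, lagrange)                         # zero vector and zero
```

The last line checks the relation $C^2 = A^2B^2 - (\mathbf{A}\cdot\mathbf{B})^2$ used above to show that $C = AB\sin\theta$.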

Problem. Prove by vector methods the trigonometric relations

$$\cos(x \pm y) = \cos x\cos y \mp \sin x\sin y$$

$$\sin(x \pm y) = \sin x\cos y \pm \cos x\sin y$$

Hint: Take three vectors:

$$\mathbf{A} = \cos x\,\mathbf{i} + \sin x\,\mathbf{j}$$

$$\mathbf{B} = \cos y\,\mathbf{i} + \sin y\,\mathbf{j}$$

$$\mathbf{C} = \cos y\,\mathbf{i} - \sin y\,\mathbf{j}$$

Form the scalar and vector products.

The close connection between the vector $\mathbf{C} = \mathbf{A}\times\mathbf{B}$ and the
parallelogram whose sides are A and B suggests that it may be useful generally to
represent areas by vectors. The convention usually adopted in this
connection, with reference to plane areas, is the following. The area is
represented by a vector perpendicular to the area, and of length equal to its
size. This leaves the direction of the vector undetermined. It
is fixed relative to the sense in which the contour of the area is described;
it is taken to be that direction in which a right-handed screw would advance
when turned in the sense in which the contour is to be described.




a. Moment of a Force. In mechanics, the moment of a force about a 
point 0 is defined as the product of the force by its perpendicular distance 
from 0. From the geometry of Fig. 5a this product equals twice the area 
of the triangle OPQ. It may be represented as 

$$\mathbf{M} = \mathbf{D}\times\mathbf{F}$$

where M, D, and F are vectors representing the moment, perpendicular 
distance and force, respectively. The sign of M, fixed by the previous 




Fig. 4-5 


definition of the area as a vector quantity, is positive on that side of the 
plane passed through 0 and the line F on which the force tends to produce 
a rotation about 0 in the positive direction. If D be drawn from 0 to any 
point in the line of action of F (cf. Fig. 5b), the perpendicular distance is 
$D\sin\theta$ and the moment is still given by the vector product, $\mathbf{D}\times\mathbf{F}$. If the
force has components $F_x, F_y, F_z$ and D has components $D_x, D_y, D_z$, the
components of M are

$$M_x = (D_yF_z - D_zF_y)$$

$$M_y = (D_zF_x - D_xF_z)$$

$$M_z = (D_xF_y - D_yF_x)$$


rotation of the body is then described by the vector $\boldsymbol{\omega}$ with magnitude equal to
the scalar $\omega$ and direction parallel to the axis of rotation. Its sense, by
convention, is positive in the same direction in which a right-handed
screw would progress under the given rotation. Any point $P$ not on the
axis (cf. Fig. 6), will then describe a
circle concentric with, and in a plane
perpendicular to the axis, this point
being determined by any vector $\mathbf{R}$
drawn from a point $O$ on the axis of
rotation. The linear velocity of $P$ is at
right angles to both $\boldsymbol{\omega}$ and $\mathbf{R}$, its magnitude
being $L = \omega R\sin\theta$, or in vector
symbols

$$\mathbf{L} = \boldsymbol{\omega}\times\mathbf{R} \tag{4-16}$$

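4.6. Products Involving Three Vectors. — From three arbitrary vectors,
A, B and C, the following products
may tentatively be formed:

(a) $\mathbf{A}(\mathbf{B}\cdot\mathbf{C})$          (d) $\mathbf{A}(\mathbf{B}\times\mathbf{C})$

(b) $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$      (e) $\mathbf{A}\cdot(\mathbf{B}\cdot\mathbf{C})$

(c) $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$     (f) $\mathbf{A}\times(\mathbf{B}\cdot\mathbf{C})$

Of these expressions, (e) and (f) are meaningless since the dot and the cross
have only been defined when vectors stand on both sides of the sign.
Furthermore, no meaning has been attached to two vectors standing
together in the absence of one of these signs, hence (d) is not considered
here.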

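a. Since $(\mathbf{B}\cdot\mathbf{C}) = BC\cos\theta$ is a scalar, the triple product $\mathbf{A}(\mathbf{B}\cdot\mathbf{C})$ is a
new vector whose direction is the same as that of A; its magnitude is that of A
multiplied by $BC\cos\theta$.

b. The product $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$, called the scalar triple product, has meaning,
for $(\mathbf{B}\times\mathbf{C}) = \mathbf{D}$, a new vector. We have

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = (\mathbf{B}\times\mathbf{C})\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{D} = \mathbf{D}\cdot\mathbf{A}$$

Moreover, the new vector D is perpendicular to both B and C, so that

$$\mathbf{B}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{C}\cdot(\mathbf{B}\times\mathbf{C}) = 0$$

If the three vectors A, B and C are the edges of a parallelepiped as
in Fig. 7, then $(\mathbf{B}\times\mathbf{C})$ is a vector whose length equals the area of the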


parallelogram forming the base of the parallelepiped; its direction is per- 
pendicular to the plane of B and C. The scalar triple product is thus the 
area of the base multiplied by the projection of the slant height of A on the 



Fig. 4-7 

vector (B X C) or a scalar whose magnitude equals the volume of the paral- 
lelepiped, v. By taking various faces in turn, we find from Fig. 7, 

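$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\cdot(\mathbf{C}\times\mathbf{A}) = \mathbf{C}\cdot(\mathbf{A}\times\mathbf{B}) = v$$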

Since a change of order in the vector product changes the sign there are 
other possible relations which may be abbreviated by writing 

v = [ABC] = A • (B X C) = (B X C) • A, etc. (4-17) 

and 

$$[\mathbf{ABC}] = [\mathbf{BCA}] = [\mathbf{CAB}] = -[\mathbf{ACB}] = -[\mathbf{BAC}] = -[\mathbf{CBA}] \tag{4-18}$$

Each term in square brackets stands for the two possible ways of writing 
the triple product as shown in (17). It also follows from (18) that the 
cross and dot may be exchanged at will, provided the cyclical order of the 
three vectors is retained. The parenthesis in a product like A • (B X C) is 
superfluous but it is often written for clarity. 

Because of (15), the scalar triple products of unit vectors all disappear
except

$$[\mathbf{ijk}] = -[\mathbf{ikj}] = 1$$

which follows from (9). If the three vectors A, B, C are written in terms of
unit vectors and the indicated multiplications performed, the use of (9),
(10) and (15) gives

$$[\mathbf{ABC}] = A_xB_yC_z + B_xC_yA_z + C_xA_yB_z - A_xC_yB_z - B_xA_yC_z - C_xB_yA_z = \begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix} \tag{4-19}$$
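The equality of the scalar triple product with the determinant (19), and the cyclic rule (18), can be checked for a particular parallelepiped. The edge vectors below are hypothetical and the function names are illustrative only.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def triple(a, b, c):
    # Scalar triple product [ABC] = A . (B x C).
    return dot(a, cross(b, c))

def det3(r0, r1, r2):
    # Determinant with rows r0, r1, r2, expanded along the first row.
    return (r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
            - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
            + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]))

A = [2.0, 0.0, 0.0]   # hypothetical edges of a parallelepiped
B = [1.0, 3.0, 0.0]
C = [1.0, 1.0, 4.0]

volume = triple(A, B, C)
print(volume, det3(A, B, C))                   # 24.0 24.0
print(triple(B, C, A), triple(C, A, B))        # cyclic orders: same value
print(triple(A, C, B), triple(B, A, C))        # non-cyclic orders: sign reversed
```

The base spanned by A and B has area 6 and the height along Z is 4, so the volume of 24 can also be confirmed by elementary geometry.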


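c. The triple vector product $\mathbf{V} = \mathbf{A}\times(\mathbf{B}\times\mathbf{C})$ is itself a vector product and
is therefore perpendicular to both of its components:

$$\mathbf{V}\cdot\mathbf{A} = 0; \quad \mathbf{V}\cdot(\mathbf{B}\times\mathbf{C}) = 0$$

but V must lie in the plane of B and C, since it is perpendicular to the vector
product of B and C, which itself is perpendicular to both B and C.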

The most important property of this triple product is that it permits
decomposition into two scalar products:

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B}) \tag{4-20}$$

a relation which may be proved geometrically or analytically by expansion
in Cartesian coordinates. Since the vector product changes its sign when
the order of multiplication is changed, the sign of the triple vector product
will change when the order of the factors in the parenthesis is changed or
when the position of the parenthesis is changed:

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = -\mathbf{A}\times(\mathbf{C}\times\mathbf{B}) = (\mathbf{C}\times\mathbf{B})\times\mathbf{A} = -(\mathbf{B}\times\mathbf{C})\times\mathbf{A}$$

Products of more than three vectors may always be reduced to one of the
three preceding types of triple products by successive application of the
above rules.
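Relation (20) is easily tested numerically before attempting the Cartesian proof. The sample vectors are arbitrary assumptions; the check also confirms that the product is perpendicular to A and to $\mathbf{B}\times\mathbf{C}$.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

A = [1.0, 2.0, 3.0]     # hypothetical sample vectors
B = [-2.0, 0.5, 4.0]
C = [0.0, 1.0, -1.0]

lhs = cross(A, cross(B, C))                       # A x (B x C)
ac, ab = dot(A, C), dot(A, B)
rhs = [b * ac - c * ab for b, c in zip(B, C)]     # B(A . C) - C(A . B)
gap = [l - r for l, r in zip(lhs, rhs)]

swapped = cross(cross(C, B), A)                   # (C x B) x A, equal to lhs

print(lhs, rhs)
print(dot(lhs, A), dot(lhs, cross(B, C)))
```

Both printed vectors coincide, and both scalar products in the last line vanish to rounding error.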

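Problem. Verify the relations:

$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C})$$

$$(\mathbf{A}\times\mathbf{B})\times(\mathbf{C}\times\mathbf{D}) = \mathbf{B}[\mathbf{ACD}] - \mathbf{A}[\mathbf{BCD}] = \mathbf{C}[\mathbf{ABD}] - \mathbf{D}[\mathbf{ABC}]$$

$$[\mathbf{A}\times\mathbf{B},\ \mathbf{B}\times\mathbf{C},\ \mathbf{C}\times\mathbf{A}] = [\mathbf{ABC}]^2$$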



Fig. 4-8 


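4.7. Differentiation of Vectors. — If a vector R is a function of a single
scalar $t$, which for convenience may be assumed to be the time, there are
three possible ways in which R may vary. Let $\mathbf{R}_1$ and $\mathbf{R}_2$ refer to
the vector at the times $t_1$ and $t_2$. Then (a) R may change in magnitude
only, (b) it may change in direction only, or (c) it may change in both
(cf. Fig. 8). In the general case a
curve is traced by the terminus of a continuously varying vector R, the
origin of the latter being kept fixed at the origin of a coordinate system.
The vector $\Delta\mathbf{R} = \mathbf{R}_2 - \mathbf{R}_1$, having the direction of the secant AB of
Fig. 8, approaches the tangent of the curve C at the point $\mathbf{R}_1$ as $\Delta t = t_2 - t_1$
approaches zero. The quotient $\Delta\mathbf{R}/\Delta t$ is the average rate of change of R
in the time interval between $t_1$ and $t_2$. Following the usual methods of
differential calculus, the derivative of R is defined as

$$\lim_{\Delta t\to 0}\frac{\Delta\mathbf{R}}{\Delta t} = \frac{d\mathbf{R}}{dt}$$

In terms of unit vectors, and with the use of primes for differentiation

$$\mathbf{R} = \mathbf{i}R_x + \mathbf{j}R_y + \mathbf{k}R_z$$

$$\mathbf{R}' = \mathbf{i}R'_x + \mathbf{j}R'_y + \mathbf{k}R'_z$$

$$\mathbf{R}'' = \mathbf{i}R''_x + \mathbf{j}R''_y + \mathbf{k}R''_z$$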

For a composite function of two or more vectors, each of which depends on
the single scalar $t$, the usual rules of differentiation hold except that, of
course, the order of the vectors must not be changed in cases involving the
vector product.

In the special case of Fig. 8a, where R is constant in direction but variable
in magnitude, $\Delta\mathbf{R}$ is parallel to R. Similarly, in case (b), $\Delta\mathbf{R}$ is
perpendicular to R, for the fixed length of R is given by $\mathbf{R}\cdot\mathbf{R} = R^2$, so that
$d(\mathbf{R}\cdot\mathbf{R})/dt = 0$ and hence $\mathbf{R}\cdot d\mathbf{R}/dt = 0$, the latter being the requirement that R and
$d\mathbf{R}/dt$ be perpendicular.
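The two special cases can be illustrated with central-difference derivatives. The particular functions R(t), the step size and the helper names below are assumptions made for this sketch, not taken from the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def derivative(R, t, h=1e-6):
    # Central-difference approximation to dR/dt, taken component by component.
    up, dn = R(t + h), R(t - h)
    return [(u - d) / (2.0 * h) for u, d in zip(up, dn)]

def R_fixed_direction(t):
    # Case (a): direction (1, 2, 2)/3 is constant, magnitude t**2 varies.
    return [t * t / 3.0, 2.0 * t * t / 3.0, 2.0 * t * t / 3.0]

def R_fixed_length(t):
    # Case (b): length 2 is constant, direction rotates in the XY-plane.
    return [2.0 * math.cos(3.0 * t), 2.0 * math.sin(3.0 * t), 0.0]

t = 0.8
dRa = derivative(R_fixed_direction, t)
dRb = derivative(R_fixed_length, t)

parallel_gap = cross(R_fixed_direction(t), dRa)   # vanishes when dR is parallel to R
perpendicular_gap = dot(R_fixed_length(t), dRb)   # vanishes when dR is perpendicular to R
print(parallel_gap, perpendicular_gap)
```

Both quantities printed are zero to within the accuracy of the finite-difference approximation.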

4.8. Scalar and Vector Fields. — A scalar field is defined as a region of 
space, with each point of which there is associated a scalar point function 
(cf. sec. 1.7). A simple example is the temperature of points in the atmos- 
phere at a given moment. On the other hand, if there is a vector associ- 
ated with each point in a region of space, the points and vectors constitute 
a vector field, an example being the wind velocity of points in the atmosphere 
at any instant. 

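Suppose $\phi(x,y,z)$ is a scalar point function referred to a given coordinate
system. It will usually change its form if referred to another system, say
$\phi'(x',y',z')$, but its value at any point must be unchanged, or $\phi = \phi'$. For
example, the temperature at any point in the atmosphere cannot depend on
the coordinate system used to describe the point. Differentiating $\phi = \phi'$
partially, we obtain

$$\frac{\partial\phi'}{\partial x'} = \frac{\partial\phi}{\partial x}\frac{\partial x}{\partial x'} + \frac{\partial\phi}{\partial y}\frac{\partial y}{\partial x'} + \frac{\partial\phi}{\partial z}\frac{\partial z}{\partial x'} = a_{11}\frac{\partial\phi}{\partial x} + a_{12}\frac{\partial\phi}{\partial y} + a_{13}\frac{\partial\phi}{\partial z}$$

$$\frac{\partial\phi'}{\partial y'} = a_{21}\frac{\partial\phi}{\partial x} + a_{22}\frac{\partial\phi}{\partial y} + a_{23}\frac{\partial\phi}{\partial z}$$

$$\frac{\partial\phi'}{\partial z'} = a_{31}\frac{\partial\phi}{\partial x} + a_{32}\frac{\partial\phi}{\partial y} + a_{33}\frac{\partial\phi}{\partial z}$$

with three similar equations for $\partial\phi/\partial x$, $\partial\phi/\partial y$, $\partial\phi/\partial z$. Comparing these
derivatives with (4) and (5), it follows that $(\partial\phi/\partial x, \partial\phi/\partial y, \partial\phi/\partial z)$ are the
three components of a vector since they transform from one reference frame
to another in the manner prescribed for vector components.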

Using the abbreviation

$$\nabla = \mathbf{i}\,\partial/\partial x + \mathbf{j}\,\partial/\partial y + \mathbf{k}\,\partial/\partial z$$

let us study the quantities $\nabla * \psi$ where $\psi$ is either a scalar or a vector and
(*) is either to be omitted or replaced by a dot or a cross in order to give
products which have meaning. The operator, $\nabla$, called "del," is not a
vector in the geometrical sense since it has no scalar magnitude, but it does
transform properly, so that it may be treated formally as a vector. The
possible products are $\nabla\phi$, where $\phi$ is a scalar point function; $\nabla\cdot\mathbf{V}$ and
$\nabla\times\mathbf{V}$, where V is a vector field.

4.9. The Gradient. — The first of these products, called the gradient of
the scalar $\phi$

$$\nabla\phi = \operatorname{grad}\phi = \mathbf{i}\,\partial\phi/\partial x + \mathbf{j}\,\partial\phi/\partial y + \mathbf{k}\,\partial\phi/\partial z \tag{4-22}$$

is a vector, since it is the product of a scalar $\phi$ and a vector $\nabla$. To
perceive its physical significance, let us consider the family of surfaces
$\phi(x,y,z)$ = constant, or the equivalent of this relation

$$d\phi = (\partial\phi/\partial x)dx + (\partial\phi/\partial y)dy + (\partial\phi/\partial z)dz = 0$$

At any point $P$ with coordinates $(x,y,z)$, on one of these surfaces, $d\mathbf{R} =
\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$ is a vector, tangent to the surface at $P$, provided $dx, dy, dz$ satisfy the
preceding equation. Since $\nabla\phi\cdot d\mathbf{R} = d\phi = 0$, $d\mathbf{R}$ and $\nabla\phi$ are
perpendicular to each other, or $\nabla\phi$ is perpendicular to that surface of the family
which passes through $P$. By the convention of signs previously
established, the direction of $\nabla\phi$ is that in which $\phi$ is increasing. For any
direction determined by the unit vector $\mathbf{s}$ with direction cosines $l, m, n$
through $P$, the component of $\nabla\phi$ in the direction $\mathbf{s}$ is

$$l\,\partial\phi/\partial x + m\,\partial\phi/\partial y + n\,\partial\phi/\partial z$$

which may be written $\mathbf{s}\cdot\nabla\phi$. This is the directional derivative of $\phi$
in the direction $\mathbf{s}$. In going from $P$ on one of the surfaces ($\phi$ = const.)
to any point $Q$ on the surface $\phi + d\phi$, the increase in $\phi$ is the same wherever
the point $Q$ is chosen, but the distance $PQ$ will be smallest and hence the
rate of increase
greatest when $\mathbf{s}$ is in the direction of the normal N. Therefore, since $\nabla\phi$
is normal to the surface $\phi$ = const., at the point $P$ its direction and
magnitude give the maximum space rate of increase of the scalar $\phi$.
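The statements about the directional derivative can be illustrated numerically. In this sketch the scalar function, the point P, the step size and the function names are all hypothetical choices; derivatives are approximated by central differences.

```python
import math

def phi(x, y, z):
    # Hypothetical scalar point function; its exact gradient is (2xy, x**2, 1).
    return x * x * y + z

def grad(f, p, h=1e-5):
    g = []
    for i in range(3):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        g.append((f(*up) - f(*dn)) / (2.0 * h))
    return g

def directional(f, p, s, h=1e-5):
    # Rate of change of f along the unit vector s.
    up = [c + h * d for c, d in zip(p, s)]
    dn = [c - h * d for c, d in zip(p, s)]
    return (f(*up) - f(*dn)) / (2.0 * h)

P = [1.0, 2.0, 0.5]
g = grad(phi, P)                                   # approximately (4, 1, 1)
g_len = math.sqrt(sum(c * c for c in g))

normal = [c / g_len for c in g]                    # unit vector along grad(phi)
tangent = [1.0 / math.sqrt(17.0), -4.0 / math.sqrt(17.0), 0.0]  # perpendicular to (4, 1, 1)
oblique = [1.0, 0.0, 0.0]

print(g, g_len)
print(directional(phi, P, normal), directional(phi, P, tangent),
      directional(phi, P, oblique))
```

Along the normal the rate of change equals the length of the gradient, along the tangent it vanishes, and in any other direction it lies between these, in agreement with $\mathbf{s}\cdot\nabla\phi$.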



4.10. The Divergence. — The scalar product of the vector operator $\nabla$
and a vector V gives a scalar which is called the divergence of V.

$$\nabla\cdot\mathbf{V} = \operatorname{div}\mathbf{V} = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\cdot(\mathbf{i}V_x + \mathbf{j}V_y + \mathbf{k}V_z) = \partial V_x/\partial x + \partial V_y/\partial y + \partial V_z/\partial z \tag{4-23}$$

If V is a vector field, the
derivative $\partial V_x/\partial x$ transforms,
when a change of
coordinate system is made,
like the product $A_xB_x$ of
the x-components of two
vectors A and B, hence the
divergence of V is a scalar
point function. Suppose
that V represents at each
point in space the direction
and magnitude of flow (density
times velocity) of some
fluid such as water or a
gas, or that it represents




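The loss of fluid mass through face ABCD per unit time is

$$\mathbf{i}\cdot\left[\mathbf{V}(x,y,z) + \frac{\partial\mathbf{V}}{\partial x}\frac{dx}{2}\right]dy\,dz$$

while the gain through EFGH is

$$\mathbf{i}\cdot\left[\mathbf{V}(x,y,z) - \frac{\partial\mathbf{V}}{\partial x}\frac{dx}{2}\right]dy\,dz$$

Therefore the net loss through these two faces is

$$\mathbf{i}\cdot\frac{\partial\mathbf{V}}{\partial x}\,dx\,dy\,dz$$

The losses through the other two pairs of faces are

$$\mathbf{j}\cdot\frac{\partial\mathbf{V}}{\partial y}\,dx\,dy\,dz \quad\text{and}\quad \mathbf{k}\cdot\frac{\partial\mathbf{V}}{\partial z}\,dx\,dy\,dz$$

The total loss from the parallelepiped is therefore

$$\left[\mathbf{i}\cdot\frac{\partial\mathbf{V}}{\partial x} + \mathbf{j}\cdot\frac{\partial\mathbf{V}}{\partial y} + \mathbf{k}\cdot\frac{\partial\mathbf{V}}{\partial z}\right]dx\,dy\,dz = \nabla\cdot\mathbf{V}\,d\tau$$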


If $\mathbf{v}$ is the velocity of the fluid of density $\rho$, $\mathbf{V} = \rho\mathbf{v}$ is called the flux density
and represents the total flow of fluid per unit cross section in unit time.
Then if no fluid is created or destroyed within the parallelepiped, this loss
of mass must equal $-(\partial\rho/\partial t)\,d\tau$,

$$\nabla\cdot\mathbf{V} = -\frac{\partial\rho}{\partial t}$$

a relation usually called the equation of continuity. If the liquid is
incompressible, $\partial\rho/\partial t = 0$, hence

$$\nabla\cdot\mathbf{V} = 0$$

A similar relation holds for D, the electric displacement,

$$\nabla\cdot\mathbf{D} = 0$$
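The divergence (23) can be approximated by central differences. The two fields below are hypothetical examples: one is divergence-free, as required of the flux density of an incompressible fluid, and the other is the radius vector.

```python
def divergence(V, p, h=1e-5):
    # Sum of dVx/dx + dVy/dy + dVz/dz by central differences, eq. (4-23).
    total = 0.0
    for i in range(3):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        total += (V(*up)[i] - V(*dn)[i]) / (2.0 * h)
    return total

def incompressible_flow(x, y, z):
    # Hypothetical field: div = 2x + x - 3x = 0 at every point.
    return (x * x, x * y, -3.0 * x * z)

def radius_vector(x, y, z):
    # R = ix + jy + kz, whose divergence is 3.
    return (x, y, z)

P = [0.7, -1.3, 2.1]
print(divergence(incompressible_flow, P), divergence(radius_vector, P))
```

The first value vanishes and the second equals 3, up to the error of the finite differences.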

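4.11. The Curl. — The vector product of $\nabla$ and V is called the curl or
rotation of V

$$\operatorname{curl}\mathbf{V} = \nabla\times\mathbf{V} = \mathbf{i}\left[\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z}\right] + \mathbf{j}\left[\frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x}\right] + \mathbf{k}\left[\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right] \tag{4-24}$$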


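This function may be used to describe the motion of a rigid body rotating
about an axis with uniform angular velocity, $\boldsymbol{\omega}$. The linear velocity of any
point $P$ in the body with radius vector R is (cf. 16) $\mathbf{L} = \boldsymbol{\omega}\times\mathbf{R}$ and

$$\operatorname{curl}\mathbf{L} = \nabla\times(\boldsymbol{\omega}\times\mathbf{R}) \tag{4-25}$$

Expanding (25), by (20)

$$\operatorname{curl}\mathbf{L} = \boldsymbol{\omega}(\nabla\cdot\mathbf{R}) - (\nabla\cdot\boldsymbol{\omega})\mathbf{R}$$

Since $\mathbf{R} = \mathbf{i}R_x + \mathbf{j}R_y + \mathbf{k}R_z = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$, $\nabla\cdot\mathbf{R} = 3$. The angular
velocity is a constant vector, so that $\nabla$ in the last term acts only on R; hence
$\nabla\cdot\boldsymbol{\omega}$ may be written $\boldsymbol{\omega}\cdot\nabla$ and the last
member of the above equation takes the form $(\boldsymbol{\omega}\cdot\nabla)\mathbf{R}$, which is to be
interpreted as the product of a scalar operator $(\boldsymbol{\omega}\cdot\nabla)$ and a vector R. Expanding,

$$(\boldsymbol{\omega}\cdot\nabla)\mathbf{R} = \left[\omega_x\frac{\partial}{\partial x} + \omega_y\frac{\partial}{\partial y} + \omega_z\frac{\partial}{\partial z}\right]\mathbf{R} = \mathbf{i}\omega_x + \mathbf{j}\omega_y + \mathbf{k}\omega_z = \boldsymbol{\omega}$$

Hence,

$$\operatorname{curl}\mathbf{L} = 3\boldsymbol{\omega} - \boldsymbol{\omega} = 2\boldsymbol{\omega}$$

or the curl of the linear velocity of any point of a rigid body equals twice
the angular velocity; the two vectors differ in magnitude but not in direction.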
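The result curl L = 2ω can be confirmed by applying a finite-difference curl to the field $\mathbf{L} = \boldsymbol{\omega}\times\mathbf{R}$. The angular velocity, the point of evaluation and the helper names are assumptions chosen for illustration.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def curl(V, p, h=1e-5):
    # d(i, j) approximates the partial derivative of component i with respect to x_j.
    def d(i, j):
        up, dn = list(p), list(p)
        up[j] += h
        dn[j] -= h
        return (V(*up)[i] - V(*dn)[i]) / (2.0 * h)
    # Components of eq. (4-24).
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

omega = [0.3, -1.2, 2.0]          # hypothetical constant angular velocity

def linear_velocity(x, y, z):
    # L = omega x R, eq. (4-16).
    return cross(omega, [x, y, z])

P = [1.5, 0.4, -2.0]
curl_L = curl(linear_velocity, P)
print(curl_L, [2.0 * w for w in omega])
```

Since L is linear in the coordinates, the same value is obtained at every point P.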

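4.12. Composite Functions Involving $\nabla$. — The following relations
involving $\nabla$ may be verified by expanding the vectors in terms of their
components along three unit vectors, i, j and k.

$$\nabla * (\mathbf{A} + \mathbf{B}) = \nabla * \mathbf{A} + \nabla * \mathbf{B}$$

$$\nabla * (\phi\mathbf{A}) = \nabla\phi * \mathbf{A} + \phi\nabla * \mathbf{A} \tag{4-26}$$

$$\nabla(\mathbf{U}\cdot\mathbf{V}) = (\mathbf{V}\cdot\nabla)\mathbf{U} + (\mathbf{U}\cdot\nabla)\mathbf{V} + \mathbf{V}\times(\nabla\times\mathbf{U}) + \mathbf{U}\times(\nabla\times\mathbf{V})$$

$$\nabla\cdot(\mathbf{U}\times\mathbf{V}) = \mathbf{V}\cdot\nabla\times\mathbf{U} - \mathbf{U}\cdot\nabla\times\mathbf{V}$$

$$\nabla\times(\mathbf{U}\times\mathbf{V}) = (\mathbf{V}\cdot\nabla)\mathbf{U} - \mathbf{V}(\nabla\cdot\mathbf{U}) - (\mathbf{U}\cdot\nabla)\mathbf{V} + \mathbf{U}(\nabla\cdot\mathbf{V})$$

In these equations A and B are either scalars or vectors depending on the
choice of (*), $\phi$ is a scalar and U, V are vectors. If $\mathbf{R} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$,

$$\nabla\cdot\mathbf{R} = 3$$

$$\nabla\times\mathbf{R} = 0$$

$$\mathbf{U}\cdot\nabla\mathbf{R} = \mathbf{U}$$

Problem. Prove eqs. (26).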

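4.13. Successive Applications of $\nabla$. — There are six possible combinations
in which $\nabla$ occurs twice. The following relations may be proved as
above by expansion in terms of i, j and k.

a. $$\nabla\cdot\nabla\phi = \nabla^2\phi = \nabla\cdot\operatorname{grad}\phi = \operatorname{div}\operatorname{grad}\phi \tag{4-27}$$

b. Since $\nabla^2$ is a scalar, it may also be applied to a vector, the result
being a new vector

$$(\nabla\cdot\nabla)\mathbf{V} = \nabla^2\mathbf{V} = \frac{\partial^2\mathbf{V}}{\partial x^2} + \frac{\partial^2\mathbf{V}}{\partial y^2} + \frac{\partial^2\mathbf{V}}{\partial z^2} \tag{4-28}$$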


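c. $$\nabla(\nabla\cdot\mathbf{V}) = \operatorname{grad}\operatorname{div}\mathbf{V} = \mathbf{i}\frac{\partial^2 V_x}{\partial x^2} + \mathbf{j}\frac{\partial^2 V_y}{\partial y^2} + \mathbf{k}\frac{\partial^2 V_z}{\partial z^2} + \mathbf{i}\left[\frac{\partial^2 V_y}{\partial x\,\partial y} + \frac{\partial^2 V_z}{\partial x\,\partial z}\right] + \mathbf{j}\left[\frac{\partial^2 V_x}{\partial x\,\partial y} + \frac{\partial^2 V_z}{\partial y\,\partial z}\right] + \mathbf{k}\left[\frac{\partial^2 V_x}{\partial x\,\partial z} + \frac{\partial^2 V_y}{\partial y\,\partial z}\right] \tag{4-29}$$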


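d. $$\nabla\times\nabla\phi = \operatorname{curl}\operatorname{grad}\phi = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ \partial\phi/\partial x & \partial\phi/\partial y & \partial\phi/\partial z \end{vmatrix} \equiv 0 \tag{4-30}$$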


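This is an identity. If for some vector V, $\nabla\times\mathbf{V} = 0$, then $\mathbf{V} = \nabla\phi$,
where $\phi$ is some scalar function. Under these conditions, V is said to be
irrotational. Expansion also yields

e. $$\nabla\cdot\nabla\times\mathbf{V} = \operatorname{div}\operatorname{curl}\mathbf{V} = 0 \tag{4-31}$$

Thus if for any vector W, $\nabla\cdot\mathbf{W} = 0$ then $\mathbf{W} = \nabla\times\mathbf{V}$ and W is said to be
solenoidal.

Finally, the reader will easily check by expansion in rectangular
components, the relation

f. $$\nabla\times(\nabla\times\mathbf{V}) = \operatorname{curl}\operatorname{curl}\mathbf{V} = \operatorname{grad}\operatorname{div}\mathbf{V} - \nabla^2\mathbf{V} = \nabla(\nabla\cdot\mathbf{V}) - \nabla\cdot\nabla\mathbf{V} \tag{4-32}$$

Problem. Show by expansion that eqs. (4-27, 28, 29, 30, 31, 32) are correct.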
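The identities (30) and (31) may be checked numerically before they are proved by expansion. The sketch below builds the operators from nested central differences; the scalar function, the vector field, the point and the step size are hypothetical choices, and the code is a check under those assumptions, not a proof.

```python
import math

def partial(f, i, h=1e-3):
    # Returns a function approximating the partial derivative of f with respect to x_i.
    def df(p):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        return (f(up) - f(dn)) / (2.0 * h)
    return df

def grad(f):
    return [partial(f, i) for i in range(3)]

def curl(V):
    # V is a list of three scalar functions (Vx, Vy, Vz); see eq. (4-24).
    return [lambda p: partial(V[2], 1)(p) - partial(V[1], 2)(p),
            lambda p: partial(V[0], 2)(p) - partial(V[2], 0)(p),
            lambda p: partial(V[1], 0)(p) - partial(V[0], 1)(p)]

def div(V):
    return lambda p: sum(partial(V[i], i)(p) for i in range(3))

phi = lambda p: math.sin(p[0]) * p[1] * p[2] ** 2          # hypothetical scalar field
V = [lambda p: p[1] * p[2],                                 # hypothetical vector field
     lambda p: p[0] ** 2 * p[2],
     lambda p: math.sin(p[0] * p[1])]

P = [0.4, -1.1, 0.8]
curl_grad = [component(P) for component in curl(grad(phi))]
div_curl = div(curl(V))(P)
print(curl_grad, div_curl)
```

Both results are zero to rounding error, because difference operators along different axes commute exactly as the partial derivatives do.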

4.14. Vector Integration. — As a simple example of vector integration,
we consider the motion of a particle under the constant acceleration of
gravity. The equation of motion is

$$d^2\mathbf{R}/dt^2 = \mathbf{G}$$

where G is a constant vector. Integration results in $d\mathbf{R}/dt = \mathbf{G}t + \mathbf{V}_0$;
$\mathbf{R} = \mathbf{G}t^2/2 + \mathbf{V}_0t + \mathbf{C}_0$, where $\mathbf{V}_0$ and $\mathbf{C}_0$ are the constants of integration
which are vectors not necessarily collinear with G. They are determined
by the values of $d\mathbf{R}/dt$ and R, respectively, when $t = 0$.

More complicated cases may arise, however, for in the general case, the 





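4.15. Line Integrals. — Suppose $d\mathbf{s}$ is the vector element of arc, where $\mathbf{s} = \mathbf{s}(t)$
is the equation for a curve. It is then possible to form the integrals:

$$\text{(a)}\ \int_C \phi\,d\mathbf{s}; \quad \text{(b)}\ \int_C \mathbf{V}\cdot d\mathbf{s}; \quad \text{(c)}\ \int_C \mathbf{V}\times d\mathbf{s} \tag{4-33}$$

each of these being called the line integral along the curve C. The results
of integration are respectively, a vector, a scalar and a vector.

Since

$$d\mathbf{s} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz \tag{4-34}$$

the first integral in (33) becomes:

$$\int_C \phi\,d\mathbf{s} = \int_C \phi(x,y,z)(\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz) = \mathbf{i}\int_{x_1}^{x_2}\phi(x,y,z)\,dx + \mathbf{j}\int_{y_1}^{y_2}\phi(x,y,z)\,dy + \mathbf{k}\int_{z_1}^{z_2}\phi(x,y,z)\,dz$$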

where A and B are initial and final points of the curve, with coordinates
$(x_1,y_1,z_1)$ and $(x_2,y_2,z_2)$. The first integral on the right may be evaluated
when $y$ and $z$ are known in terms of $x$ for points on the curve C. The
remaining integrals are determined in a similar fashion. The problem thus
reduces to the usual line integral in scalar calculus except that it is
necessary to specify the direction in which the radius vector $\mathbf{s}$ describes the
curve during integration, for if the direction A to B is taken as positive,
then

$$\int_A^B \phi\,d\mathbf{s} = -\int_B^A \phi\,d\mathbf{s}$$

In case C is a closed curve, the direction is always taken so that the enclosed
area appears positive (cf. sec. 4.5).

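No difficulty is experienced in the interpretation of (b) and (c) of (33)
as the following example shows. Let $\mathbf{V} = xy\,\mathbf{i} - z^2\,\mathbf{j} + xyz\,\mathbf{k}$; evaluate
$\int\mathbf{V}\cdot d\mathbf{s}$ from the point $A = (0,0,0)$ to $B = (1,1,1)$ along the curve
$\mathbf{s} = \mathbf{i}t + \mathbf{j}t^2 + \mathbf{k}t^3$.

$$\int\mathbf{V}\cdot d\mathbf{s} = \int(xy\,\mathbf{i} - z^2\,\mathbf{j} + xyz\,\mathbf{k})\cdot(\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz) = \int(xy\,dx - z^2\,dy + xyz\,dz)$$

Since $\mathbf{s}$ is the position vector of points on the curve, the coordinates of
these points are $x = t$, $y = t^2$, $z = t^3$, so that $dx = dt$, $dy = 2t\,dt$,
$dz = 3t^2\,dt$, with $t$ running from 0 to 1. Hence,

$$\int_A^B\mathbf{V}\cdot d\mathbf{s} = \int_0^1 (t^3 - 2t^7 + 3t^8)\,dt = \frac{1}{4} - \frac{1}{4} + \frac{1}{3} = \frac{1}{3}$$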
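The worked example can be reproduced numerically by summing $\mathbf{V}\cdot\Delta\mathbf{s}$ over short chords of the curve. This is only a sketch; the number of subdivisions and the function names are assumptions.

```python
def V(x, y, z):
    # The field of the example: V = xy i - z**2 j + xyz k.
    return (x * y, -z * z, x * y * z)

def curve(t):
    # s = i t + j t**2 + k t**3, running from A (t = 0) to B (t = 1).
    return (t, t ** 2, t ** 3)

def line_integral(field, path, n=20000):
    # Sum of field(midpoint) . (chord) over n short chords of the path.
    total = 0.0
    for step in range(n):
        t0, t1 = step / n, (step + 1) / n
        a, b, m = path(t0), path(t1), path(0.5 * (t0 + t1))
        f = field(*m)
        total += sum(fc * (bc - ac) for fc, ac, bc in zip(f, a, b))
    return total

forward = line_integral(V, curve)
backward = line_integral(V, lambda t: curve(1.0 - t))
print(forward, backward)
```

The forward value approaches 1/3, and traversing the curve from B to A reverses the sign.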




An important special case arises in scalar calculus when the function to
be integrated is an exact differential, where the value of the integral is
independent of the path. In vector calculus, suppose

$$\mathbf{V} = \operatorname{grad}\phi = \nabla\phi \tag{4-35}$$

with $\phi$ a scalar point function. Then using (22) and (34)

$$\int_A^B\mathbf{V}\cdot d\mathbf{s} = \int_A^B\left(\frac{\partial\phi}{\partial x}dx + \frac{\partial\phi}{\partial y}dy + \frac{\partial\phi}{\partial z}dz\right) = \int_A^B d\phi = \phi_B - \phi_A \tag{4-36}$$

If the integration is taken around a closed curve, $B = A$, then

$$\int_A^A\nabla\phi\cdot d\mathbf{s} = \oint\nabla\phi\cdot d\mathbf{s} = 0$$

Conversely, if $\oint\mathbf{V}\cdot d\mathbf{s} = 0$, then (35) must hold, i.e., V is the gradient of
some scalar point function $\phi$. We have therefore shown that if $\mathbf{V} = \nabla\phi$
the line integral $\int_A^B\mathbf{V}\cdot d\mathbf{s}$ depends only on the initial and final values of $\phi$
and is independent of the path.
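Independence of the path is readily demonstrated for a particular gradient field. Here the potential $\phi = xyz$ and the two paths joining (0,0,0) to (1,1,1) are hypothetical choices; the integrals are approximated by chord sums.

```python
def phi(x, y, z):
    # Hypothetical scalar point function.
    return x * y * z

def grad_phi(x, y, z):
    # Its gradient, written out by hand: (yz, xz, xy).
    return (y * z, x * z, x * y)

def straight(t):
    return (t, t, t)

def twisted(t):
    return (t, t ** 2, t ** 3)

def line_integral(field, path, n=20000):
    # Sum of field(midpoint) . (chord) over n short chords of the path.
    total = 0.0
    for step in range(n):
        t0, t1 = step / n, (step + 1) / n
        a, b, m = path(t0), path(t1), path(0.5 * (t0 + t1))
        f = field(*m)
        total += sum(fc * (bc - ac) for fc, ac, bc in zip(f, a, b))
    return total

along_line = line_integral(grad_phi, straight)
along_cubic = line_integral(grad_phi, twisted)
difference_of_phi = phi(1.0, 1.0, 1.0) - phi(0.0, 0.0, 0.0)
print(along_line, along_cubic, difference_of_phi)
```

Both integrals agree with $\phi_B - \phi_A = 1$, as (36) requires.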

4.16. Surface and Volume Integrals. — Let $\Sigma$ be any surface, divided into
infinitesimal elements each of which may be considered as a vector $d\mathbf{S}$.
The surface integral may then be described as in ordinary analysis;
again there are three cases:

$$\text{(a)}\ \iint\phi\,d\mathbf{S}; \quad \text{(b)}\ \iint\mathbf{V}\cdot d\mathbf{S}; \quad \text{(c)}\ \iint\mathbf{V}\times d\mathbf{S}$$

giving a vector, a scalar and a vector. As before, it is important to specify
the side of the surface over which the integration is performed, for although
$d\mathbf{S}$ is normal to the surface, the signs of the normals on opposite sides are
opposite. The sign of the normal is uniquely determined by the previous
conventions except for the case of a one-sided surface⁵ such as the Möbius
strip. If the surface encloses a portion of space, $d\mathbf{S}$ is taken as the outward
pointing normal. The surface integral $\iint\mathbf{V}\cdot d\mathbf{S}$ is called the flux of V
through the surface, for if V is the product of density and the velocity of a
fluid, the integral is the amount of fluid flowing through a surface in unit
time. The vector V may also refer to electric, magnetic or gravitational
force, flow of heat and so on.

5 See, for example, Burington, R. S., and Torrance, C. C., "Higher Mathematics,"
McGraw-Hill Book Co., New York, 1939, pp. 250ff.

Let $d\tau = dx\,dy\,dz$ be an element of volume. Since this is a scalar, there
are only two possible volume integrals

$$\text{(a)}\ \iiint\phi\,d\tau; \quad \text{(b)}\ \iiint\mathbf{V}\,d\tau$$

the first being a scalar and the second a vector.

It is often convenient to convert multiple integrals into others with
fewer integral signs. One possibility has previously been presented in (36),
namely that the line integral $\int_C\mathbf{V}\cdot d\mathbf{s}$ may be reduced to the difference
between two scalar quantities, provided $\mathbf{V} = \nabla\phi$. A line integral may also
be converted into a double or surface integral by Stokes' theorem, or
conversely, the double integral may be reduced to a single integral.

4.17. Stokes' Theorem. — This theorem may be stated in the form

$$\oint_C\mathbf{V}\cdot d\mathbf{s} = \iint_\Sigma\nabla\times\mathbf{V}\cdot d\mathbf{S} \tag{4-37}$$

Conversely, if $\mathbf{W} = \nabla\times\mathbf{V}$, where V is another vector, then the value of
the surface integral $\iint_\Sigma\mathbf{W}\cdot d\mathbf{S}$ depends only upon values of V at points
on the boundary of the surface,

$$\iint_\Sigma\mathbf{W}\cdot d\mathbf{S} = \oint_C\mathbf{V}\cdot d\mathbf{s}$$

The vector V may be taken as flux density of a fluid or as the field of a
mechanical or electrical force. In the latter case, the line integral
represents the work done on a particle moving along a curve C. If the curve is
closed, forming the boundary of a region $\Sigma$, then according to the theorem
the work done equals the surface integral of the curl of the force field.
In the special case where the work done is independent of the path, the line
integral vanishes so that a requirement for independence of the path is that
$\nabla\times\mathbf{V} = 0$.
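Stokes' theorem can be checked numerically in a simple case. In the sketch below the surface is assumed to be the unit disk in the XY-plane, bounded by the unit circle described counterclockwise, and the field $\mathbf{V} = -y^3\mathbf{i} + x^3\mathbf{j}$ is a hypothetical example whose curl, worked out by hand, has the single component $3x^2 + 3y^2$ along k.

```python
import math

def V(x, y, z):
    # Hypothetical field chosen for the check.
    return (-y ** 3, x ** 3, 0.0)

def curl_z(x, y):
    # z-component of curl V, derived by hand: d(x**3)/dx - d(-y**3)/dy.
    return 3.0 * x * x + 3.0 * y * y

def circulation(n=2000):
    # Line integral of V . ds around the unit circle, midpoint rule in the angle.
    total = 0.0
    dth = 2.0 * math.pi / n
    for step in range(n):
        th = (step + 0.5) * dth
        x, y = math.cos(th), math.sin(th)
        vx, vy, _ = V(x, y, 0.0)
        total += (vx * (-y) + vy * x) * dth      # ds = (-sin, cos, 0) d(theta)
    return total

def curl_flux(nr=200, nt=200):
    # Surface integral of (curl V) . k over the disk, in polar coordinates.
    total = 0.0
    dr, dth = 1.0 / nr, 2.0 * math.pi / nt
    for a in range(nr):
        r = (a + 0.5) * dr
        for b in range(nt):
            th = (b + 0.5) * dth
            total += curl_z(r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total

line_side = circulation()
surface_side = curl_flux()
print(line_side, surface_side, 1.5 * math.pi)
```

Both sides approach $3\pi/2$; the small residual difference comes from the midpoint rule used for the surface integral.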


A proof of Stokes' theorem follows. Consider a surface $\Sigma$ bounded by
a closed curve C, and let C' be the projection of C on the XY-plane, each
point $P'(x,y)$ of the projection corresponding to a point
$P(x,y,z)$ of the surface. This means that on the surface, where $z$ is a
function of $x$ and $y$, a function $u(x,y,z)$ becomes

$$u(x,y,z) = \phi(x,y) \tag{4-38}$$

since the value of $\phi$ on C' must equal the value of $u$ on C. Similarly with
other functions

$$v(x,y,z) = \chi(x,z); \quad w(x,y,z) = \psi(y,z)$$

when projections are made on the XZ- and YZ-planes. We may write for
the vector defined at each point on the surface,

$$\mathbf{V} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$$


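If we furthermore take a unit vector $\mathbf{n}$, perpendicular to the surface at any
point, the right-hand side of (37) becomes after expansion

$$\iint_\Sigma\mathbf{n}\cdot\nabla\times\mathbf{V}\,dS = \iint_\Sigma\mathbf{n}\cdot(\nabla\times u\mathbf{i} + \nabla\times v\mathbf{j} + \nabla\times w\mathbf{k})\,dS \tag{4-39}$$

A typical term of (39) may be transformed as follows.

$$\mathbf{n}\cdot\nabla\times u\mathbf{i} = \mathbf{n}\cdot\left[\mathbf{j}\frac{\partial u}{\partial z} - \mathbf{k}\frac{\partial u}{\partial y}\right] = -\mathbf{n}\cdot\mathbf{k}\,\frac{\partial\phi}{\partial y} \tag{4-40}$$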


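the second expression coming from (24). The last member of (40) is
obtained as follows. The partial derivative

$$\frac{\partial\mathbf{s}}{\partial y} = \mathbf{j} + \mathbf{k}\frac{\partial z}{\partial y}$$

of $\mathbf{s} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ is a vector, tangent to the curve cut from $\Sigma$ by a plane
perpendicular to the X-axis. It is perpendicular to $\mathbf{n}$, hence

$$\mathbf{n}\cdot\left[\mathbf{j} + \mathbf{k}\frac{\partial z}{\partial y}\right] = 0 \tag{4-41}$$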


Substitution of (41) and the partial derivative of (38),

$$\frac{\partial\phi}{\partial y} = \frac{\partial u}{\partial y} + \frac{\partial u}{\partial z}\frac{\partial z}{\partial y}$$

into (40), gives the last term of that equation. Since $\mathbf{n}\cdot\mathbf{k}\,dS = dx\,dy$, we
may write

$$\iint_\Sigma\mathbf{n}\cdot\nabla\times u\mathbf{i}\,dS = -\iint\frac{\partial\phi}{\partial y}\,dx\,dy \tag{4-42}$$

The integral on the right of (42) may be written

$$-\iint\frac{\partial\phi}{\partial y}\,dy\,dx = -\int(\phi_2 - \phi_1)\,dx$$

where $\phi_2$ and $\phi_1$ are the values of $\phi$ at the maximum and minimum values
of $y$, $y_2$ and $y_1$, respectively. If $d\sigma$ is a line element of the contour C', we
may write $dx = \pm(\partial x/\partial\sigma)\,d\sigma$, choosing the sign in accordance with the
position of $d\sigma$ on the contour. Since it is negative at $y_2$ and positive at $y_1$,
the integral becomes

$$\oint_{C'}\phi\,\frac{\partial x}{\partial\sigma}\,d\sigma = \oint_{C'}\phi\,dx$$

Remembering (38) and the fact that between two points on C' the change
in $x$ is the same as that between the equivalent points on C,

$$\oint_{C'}\phi\,dx = \oint_C u\,dx$$

so that we finally have

$$\iint_\Sigma\mathbf{n}\cdot\nabla\times u\mathbf{i}\,dS = \oint_C u\,dx$$

Similar equations are obtained from consideration of projections of $\Sigma$ on
the XZ- and YZ-planes. When they are added together, Stokes' theorem
results.

4.18. Theorem of the Divergence. — A method of reducing triple integrals
to double integrals is offered in the theorem of the divergence, which may
be written

$$\iiint_\tau\nabla\cdot\mathbf{V}\,d\tau = \iint_\Sigma\mathbf{V}\cdot d\mathbf{S} \tag{4-43}$$

The Cartesian form of this equation

$$\iiint\left(\frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}\right)dx\,dy\,dz = \iint(V_x\,dy\,dz + V_y\,dx\,dz + V_z\,dx\,dy)$$

is often called Gauss' theorem. Suppose V represents the flux density of an
incompressible fluid. Then, as we have shown, $\nabla\cdot\mathbf{V}\,d\tau$ is the total amount of
fluid flowing out of a volume $d\tau$ per second. The total flow from a large
volume is $\iiint\nabla\cdot\mathbf{V}\,d\tau$, which must equal the rate of flow across all of
the surfaces of the volume, $\iint\mathbf{V}\cdot d\mathbf{S}$. This proves the theorem.⁶ If
we assume a steady state, the total amount of flow neither increases nor
decreases in time and hence must be maintained constant by sources or
sinks within the region, unless the density of the fluid is continually
changing (which is contrary to the initial assumption). In view of Gauss'
theorem the divergence of the field takes on an interesting meaning. Since

$$\operatorname{div}\mathbf{V} = \nabla\cdot\mathbf{V} = \lim_{d\tau\to 0}\frac{1}{d\tau}\iint\mathbf{V}\cdot d\mathbf{S} \tag{4-44}$$

the divergence is the same as the intensity of the steady flow at a given
point. This argument may be continued to derive the equation of
continuity which has been obtained in another way in sec. 4.10.
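The theorem of the divergence can be verified for a simple region. The sketch assumes the unit cube $0 \le x, y, z \le 1$ and a hypothetical field $\mathbf{V} = x^2\mathbf{i} + yz\,\mathbf{j} + (x + z^2)\mathbf{k}$, whose divergence, worked out by hand, is $2x + 3z$; both sides are evaluated by the midpoint rule.

```python
def V(x, y, z):
    # Hypothetical field chosen for the check.
    return (x * x, y * z, x + z * z)

def div_V(x, y, z):
    # Divergence derived by hand: 2x + z + 2z.
    return 2.0 * x + 3.0 * z

def volume_integral(n=20):
    h = 1.0 / n
    total = 0.0
    for a in range(n):
        for b in range(n):
            for c in range(n):
                total += div_V((a + 0.5) * h, (b + 0.5) * h, (c + 0.5) * h)
    return total * h ** 3

def outward_flux(n=20):
    # Sum over the six faces; the outward normal is +axis on the far face
    # and -axis on the near face of each pair.
    h = 1.0 / n
    total = 0.0
    for a in range(n):
        for b in range(n):
            u, v = (a + 0.5) * h, (b + 0.5) * h
            total += V(1.0, u, v)[0] - V(0.0, u, v)[0]   # faces x = 1 and x = 0
            total += V(u, 1.0, v)[1] - V(u, 0.0, v)[1]   # faces y = 1 and y = 0
            total += V(u, v, 1.0)[2] - V(u, v, 0.0)[2]   # faces z = 1 and z = 0
    return total * h * h

left = volume_integral()
right = outward_flux()
print(left, right)
```

Both sides equal 2.5; the midpoint rule is exact here because every integrand is at most linear in the variables of integration.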

A further application of the divergence theorem arises in the problem of 
heat flow. Consider the flow of heat into a thermally isotropic solid body, 
the temperature of which is not the same at all points. The rate of flow 
of heat into the body is

$$-\int_S \mathbf{V}\cdot d\mathbf{S}$$

where $\mathbf{V}$ is the flux of heat, the amount of heat which crosses unit area
drawn perpendicular to the lines of flow per unit time. By Fourier's law,
heat flows in the direction of most rapid decrease in temperature, $U$, with a
rate proportional to the thermal conductivity $k$ of the solid, or

$$\mathbf{V} = -k\nabla U \tag{4-45}$$

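If there are no sources or sinks of heat within the body, and if $\rho$ is the
density of the solid and $s$ its specific heat, the amount of heat entering unit
volume in unit time is

$$s\rho\,\frac{\partial U}{\partial t}$$

For the whole body, the heat gained must equal that passing through the
surface

$$\int_\tau s\rho\,\frac{\partial U}{\partial t}\,d\tau = -\int_S \mathbf{V}\cdot d\mathbf{S}$$

and this becomes in view of eq. (43)

$$\int_\tau\left(s\rho\,\frac{\partial U}{\partial t} + \nabla\cdot\mathbf{V}\right)d\tau = 0$$

This equation must hold for every surface, hence

$$s\rho\,\frac{\partial U}{\partial t} = -\nabla\cdot\mathbf{V}$$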





Thus because of (45)

$$s\rho\,\frac{\partial U}{\partial t} = \nabla\cdot(k\nabla U)$$

or

$$\frac{\partial U}{\partial t} = p^2\,\nabla^2 U$$

with $p^2 = k/s\rho$ and $k$ assumed constant. For a stationary state, $\nabla^2 U = 0$;
this is Laplace's equation, so the same law holds for the distribution of
temperature as for the distribution of potential in charge-free space.
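
The approach to the stationary state can be illustrated in one dimension. The sketch below assumes a rod with fixed end temperatures and an explicit finite-difference scheme; the grid size, the value of p², the boundary temperatures, and the function name are all hypothetical choices made for the example, not part of the text.

```python
# Explicit finite-difference relaxation of dU/dt = p2 * d2U/dx2 on a rod.
# All names and numerical values here are illustrative assumptions.

def relax_rod(n=21, p2=1.0, length=1.0, u_left=0.0, u_right=100.0, steps=5000):
    dx = length / (n - 1)
    dt = 0.4 * dx * dx / p2           # inside the stability limit dt <= dx^2 / (2 p2)
    u = [u_left] + [0.0] * (n - 2) + [u_right]
    for _ in range(steps):
        new = u[:]                     # end values stay fixed (boundary conditions)
        for i in range(1, n - 1):
            new[i] = u[i] + p2 * dt / dx ** 2 * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        u = new
    return u

profile = relax_rod()
# In the stationary state d2U/dx2 = 0, so the profile should be a straight line.
linear = [100.0 * i / 20 for i in range(21)]
max_deviation = max(abs(a - b) for a, b in zip(profile, linear))
print(max_deviation)
```

After enough steps the temperature profile becomes linear between the two end values, which is the one-dimensional solution of Laplace's equation.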

4.19. Green’s Theorems. — The three fundamental relations (36), 
(37) and (43) may be used to obtain a large number of formulas for the 
transformation of integrals, the results corresponding to integration by 
parts in scalar calculus. The two most important such formulas are 
known as Green’s theorems , when given in Cartesian form. In vector 
notation, these are 


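$$\int \nabla\phi\cdot\nabla\psi\,d\tau = \int \phi\nabla\psi\cdot d\mathbf{S} - \int \phi\nabla^2\psi\,d\tau = \int \psi\nabla\phi\cdot d\mathbf{S} - \int \psi\nabla^2\phi\,d\tau \tag{4-46}$$

$$\int\left(\phi\nabla^2\psi - \psi\nabla^2\phi\right)d\tau = \int_S\left(\phi\nabla\psi - \psi\nabla\phi\right)\cdot d\mathbf{S} \tag{4-47}$$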


Green's first theorem is easily found by substituting $\mathbf{V} = \phi\nabla\psi$ in (43).
The second theorem is obtained by interchanging $\phi$ and $\psi$ in (46) and
subtracting the result from (46).

Problem. Verify eqs. (46) and (47). 

4.20. Tensors. — In many physical problems, the notion of a vector is 
too restricted. For example, in an isotropic medium, stress $\mathbf{S}$ and strain $\mathbf{X}$
are related by the vector equation $\mathbf{S} = k\mathbf{X}$, $\mathbf{X}$ and $\mathbf{S}$ having the same
direction. If the medium is not isotropic, $\mathbf{S}$ and $\mathbf{X}$ are not in general in the
same direction; it is then necessary to replace the scalar $k$ by a more
general mathematical construct capable, when acting on the vector $\mathbf{X}$,
of changing its direction as well as its magnitude. Such a construct is a
tensor. A similar generalization has to be made in the vector equations

$$\mathbf{P} = \epsilon\mathbf{E}$$

where $\mathbf{P}$ and $\mathbf{E}$ represent electric polarization and field strength



where I and H represent intensity of magnetization and field strei 
anisotropic media, the susceptibilities c and ^ must be replaced hy 
Again, if it is desired to represent the displacements 5v of ti 
in a strained elastic medium as functions of their position ved 
tensor equation of the form = tv is needed, for dw and v differ 
tion, and the tensor i must effect this difference. This exampl 
treated in detail in see. 4.23; but first we shall discuss the a 
properties of tensors. 

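Let us consider for complete generality a space of $\nu$ dimensions and
assume that two different reference frames are given so that a point whose
coordinates⁷ in the first one are $(x^1, x^2, \cdots, x^\nu)$ has the coordinates
$(\bar{x}^1, \bar{x}^2, \cdots, \bar{x}^\nu)$ in the second system. Further let there be relations

$$\bar{x}^m = f^m(x^1, x^2, \cdots, x^\nu); \quad x^m = g^m(\bar{x}^1, \bar{x}^2, \cdots, \bar{x}^\nu); \quad m = 1, 2, 3, \cdots, \nu \tag{4-48}$$

so that we may transform from one system to the other. Then if $\nu$ quantities
$(A^1, A^2, \cdots, A^\nu)$ are related to $\nu$ other quantities $(\bar{A}^1, \bar{A}^2, \cdots, \bar{A}^\nu)$ by
the equations

$$\bar{A}^m = \sum_{i=1}^{\nu}\frac{\partial\bar{x}^m}{\partial x^i}A^i; \quad m = 1, 2, \cdots, \nu \tag{4-49}$$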


they are said to be the components of a contravariant vector or a contravariant tensor of
the first rank. To simplify the notation, it is customary to omit the summation
sign and sum over indices which are repeated on the same side of an
equation. An index which is not repeated is understood to take successively
the values $1, 2, \cdots, \nu$, so that there are altogether $\nu$ different equations.
With these conventions, we may rewrite (49) as

$$\bar{A}^m = \frac{\partial\bar{x}^m}{\partial x^i}A^i \tag{4-50}$$


A further word about notation should be added. Since a repeated index
(it is often called a dummy or umbral index) indicates summation, any other
letter may be substituted for it at will. Thus (50) may also be written

$$\bar{A}^m = \frac{\partial\bar{x}^m}{\partial x^j}A^j = \frac{\partial\bar{x}^m}{\partial x^n}A^n, \text{ etc.}$$

We will often use the same symbol such as $A^i$ to indicate both the tensor
and the $i$-th component of a tensor. No confusion should result from this
arrangement.

⁷ The upper suffix is not an exponent. Its position has an important meaning, as the
subsequent discussion will show.




A covariant vector with components $A_m$ in one system and $\bar{A}_m$ in another
is defined by the relation

$$\bar{A}_m = \frac{\partial x^i}{\partial\bar{x}^m}A_i \tag{4-51}$$

If (48) is differentiated we obtain

$$d\bar{x}^m = \frac{\partial\bar{x}^m}{\partial x^i}\,dx^i \tag{4-51a}$$

hence we see that the components of an ordinary vector in $\nu$-dimensional
space are actually the components of a contravariant tensor of rank one.
To find an example of a covariant vector consider a scalar point function
$\phi(x^m) = \bar\phi(\bar{x}^m)$. The components of the gradient of $\phi$ will be $\partial\phi/\partial x^m$ and

$$\frac{\partial\bar\phi}{\partial\bar{x}^m} = \frac{\partial\phi}{\partial x^i}\frac{\partial x^i}{\partial\bar{x}^m}$$


Thus the gradient of such a function is a covariant vector. The reader 
should not conclude, however, that a covariant vector is necessarily the 
gradient of a scalar. 
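
The two transformation laws can be tried out on the change from plane Cartesian coordinates (x, y) to plane polar coordinates (r, θ). This is a hedged sketch: the choice of coordinate systems, the sample components, and the function names are assumptions made for the example. It applies the contravariant law to one set of components and the covariant law to another, and then compares the sum of products of corresponding components in the two systems.

```python
import math

# Transform vector components from Cartesian (x, y) to polar (r, theta).
# Function names and sample numbers are illustrative assumptions.

def to_polar_contravariant(A, x, y):
    # A_bar^m = (d xbar^m / d x^i) A^i, with xbar = (r, theta)
    r2 = x * x + y * y
    r = math.sqrt(r2)
    J = [[x / r, y / r],          # dr/dx,      dr/dy
         [-y / r2, x / r2]]       # dtheta/dx,  dtheta/dy
    return [sum(J[m][i] * A[i] for i in range(2)) for m in range(2)]

def to_polar_covariant(B, x, y):
    # B_bar_m = (d x^i / d xbar^m) B_i
    r = math.hypot(x, y)
    c, s = x / r, y / r
    K = [[c, s],                  # dx/dr,      dy/dr
         [-r * s, r * c]]         # dx/dtheta,  dy/dtheta
    return [sum(K[m][i] * B[i] for i in range(2)) for m in range(2)]

x, y = 3.0, 4.0
A = [1.0, 2.0]       # contravariant components in the Cartesian system
B = [0.5, -1.5]      # covariant components in the Cartesian system

A_bar = to_polar_contravariant(A, x, y)
B_bar = to_polar_covariant(B, x, y)

inner = sum(a * b for a, b in zip(A, B))
inner_bar = sum(a * b for a, b in zip(A_bar, B_bar))
print(inner, inner_bar)
```

The individual components change under the transformation, but the sum of products of a contravariant with a covariant set comes out the same in both systems, because the two Jacobian matrices are inverse to each other.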

These ideas may be extended easily to define tensors of any rank. 
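If $\phi(x^m) = \bar\phi(\bar{x}^m)$, we speak of $\phi$ as a tensor of zero rank or a scalar or
invariant. There are three varieties of second rank tensors⁸ defined by the
transformations

$$\bar{A}^{mn} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial\bar{x}^n}{\partial x^j}A^{ij} \tag{4-52}$$

$$\bar{A}_{mn} = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}A_{ij} \tag{4-53}$$

$$\bar{A}^m_n = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}A^i_j \tag{4-54}$$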


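They are called contravariant, covariant and mixed, respectively. A useful
mixed tensor of the second rank is the Kronecker delta

$$\delta^m_n = 1;\ m = n$$
$$\delta^m_n = 0;\ m \ne n \tag{4-55}$$

This is seen as follows. Suppose $\delta^i_j$ is this tensor in the coordinate system
$x^i$; then from (54)

$$\bar\delta^m_n = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\delta^i_j = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^i}{\partial\bar{x}^n} = \frac{\partial\bar{x}^m}{\partial\bar{x}^n} = \delta^m_n$$

We thus see that $\delta^m_n$ has the same components in all coordinate systems.
Tensors of higher rank are defined by similar laws, for example, a
mixed tensor of rank four is

$$\bar{A}^m_{npq} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^p}\frac{\partial x^h}{\partial\bar{x}^q}A^i_{jkh} \tag{4-56}$$

It should be noted that if $\nu$ is the number of dimensions of the coordinate
system, then a tensor of rank $\alpha$ has $\nu^\alpha$ components.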

4.21. Addition, Multiplication and Contraction. — The sum or difference
of two or more tensors of the same rank and type is a tensor of the same
rank and type. For example, if

$$A^{mn} + B^{mn} = C^{mn}$$

it follows from (52) that $C^{mn}$ is a tensor. It frequently happens that the
components of a tensor satisfy the relation

$$A^{mn} = A^{nm}$$

such a tensor being called symmetric. On the other hand, if $A^{mn} = -A^{nm}$,
the tensor is skew-symmetric. When neither of these relations holds, a
given tensor may always be written as the sum of a symmetric and a skew-symmetric
tensor. To see this let us take

$$S^{mn} = \tfrac{1}{2}(A^{mn} + A^{nm}); \quad T^{mn} = \tfrac{1}{2}(A^{mn} - A^{nm})$$

where $A^{mn}$ is neither symmetric nor skew-symmetric. Then

$$A^{mn} = S^{mn} + T^{mn}$$

The property of being symmetric or skew-symmetric is unaltered when the
tensor is transformed from one reference frame to another.
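
The splitting into symmetric and skew-symmetric parts is easy to carry out on a concrete array of components. The following sketch uses an arbitrary 3 × 3 array; the numbers and the function name are assumptions for illustration only.

```python
# Split a second-rank tensor (given as a square array of components) into
# symmetric and skew-symmetric parts: S = (A + A^T)/2, T = (A - A^T)/2.
# The sample components are arbitrary illustrative values.

def split_symmetric_skew(A):
    n = len(A)
    S = [[0.5 * (A[m][k] + A[k][m]) for k in range(n)] for m in range(n)]
    T = [[0.5 * (A[m][k] - A[k][m]) for k in range(n)] for m in range(n)]
    return S, T

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]

S, T = split_symmetric_skew(A)
for row in S:
    print(row)
for row in T:
    print(row)
```

In three dimensions the symmetric part carries six independent components and the skew-symmetric part three, which together account for the nine components of the original tensor.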

An important relation exists between vectors and skew-s; 
tensors. Suppose C = A X B, where the components of C are 
(13). But the components of A (or B) form a skew-symmetri 
an = — aji, an = 0, where a i3 = A y , a 2 \ = A Z) a 32 = A x . We n 
ever, that if the vectors A and B were drawn in a left-handed c< 
system, their directions would both be opposite to those in a rigb 
system while C, their vector product, would have the same directio 
coordinate systems. 

The more common type of vector, such as that representing tr 
or a mechanical force, is often called a polar vector to distinguish 

P •wVn/’Vi Tiqo nrmencil Kzi/-! HP V. 104 -+ 




an axial vector or a pseudovector , 9 requires the idea not merely of a displace- 
ment, but of some basic direction such as that implicit in a right-handed 
(or left-handed) coordinate system. A typical example of it is the vector 
product of two polar vectors, like angular momentum or the moment of a
force. A pseudovector in a three-dimensional Cartesian coordinate system, 
as we have seen, behaves in most respects like a proper vector but in the 
more general case it transforms like a skew-symmetric tensor. 

The scalar product of a polar vector and a pseudovector is called a 
pseudoscalar . 9 It differs from a true scalar, which must have the same 
magnitude in all coordinate systems, since it will change its sign if the 
direction of its coordinate system is changed. 


Problem. Show that in two dimensions a skew-symmetric tensor of second rank 
is a pseudoscalar and that one of third rank is impossible; in three dimensions, that a 
second rank skew-symmetric tensor is a pseudovector and a third rank tensor is a 
pseudoscalar. 


If we write a tensor in matrix form and compare it with eq. (10-16) it is 
clear that the components of the tensor are also the elements of a matrix. 
The only difference lies in the fact that tensors may always be written in 
matrix form if so desired, but the elements of a matrix do not need to 
transform in the same manner as tensors. 

If we multiply $A^m$ by $B_n$ we obtain the mixed tensor $A^m B_n = C^m_n$. It
is easily seen that $C^m_n$ transforms like (54). This type of product, called the
outer product, may be obtained with tensors of any rank or type; thus
$A^m_n B_{pq} = C^m_{npq}$. It should not be inferred, however, that every tensor can
be written as a product in this way. Neither should we conclude that the 
outer product is the same as the vector product of sec. 4.5. 

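Let us set $m = q$ in the mixed tensor of (56) and write $B_{np} = A^m_{npm}$.
To show that our notation, which indicates that $A^m_{npm}$ is a covariant tensor
of rank two, is justified we use the transformation law (56),

$$\bar{B}_{np} = \bar{A}^m_{npm} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^p}\frac{\partial x^h}{\partial\bar{x}^m}A^i_{jkh} = \delta^h_i\,\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^p}A^i_{jkh} = \frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^p}A^i_{jki}$$

Comparison with (53) convinces us that $A^i_{jki}$ is indeed a covariant tensor
of rank two since it transforms in the required way. This process of
summing over a pair of contravariant and covariant indices is called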




contraction. It always reduces the rank of a mixed tensor by two, thus
when it is applied to a mixed tensor of rank two the result is a scalar:

$$\bar{A}^m_m = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^m}A^i_j = \delta^j_i A^i_j = A^i_i$$

When two tensors are multiplied together and then contracted, we speak of
inner multiplication, thus

$$A^{mn}B_{npq} = C^m_{pq}; \quad A^m B_m = \text{a scalar}$$

The last example is clearly equivalent to the scalar product in rectangular
coordinates (cf. sec. 4.4), hence in tensor analysis, we say that if $l$ is the
length of $A^m$ or $A_m$

$$l^2 = A^m A_m \tag{4-58}$$

From (8) we conclude that the angle $\theta$ between two vectors $A^m$ and $B_m$ is
defined by

$$\cos\theta = \frac{A_m B^m}{[(A_m A^m)(B_m B^m)]^{1/2}}$$

and if $A^m$ and $B_m$ are perpendicular to each other,

$$A^m B_m = 0$$

We have just shown how new tensors may be obtained by addition, 
multiplication and contraction. We now inquire whether it is possible to 
change contravariant tensors to covariant ones or the reverse. Let $g_{mn}$
be any symmetric covariant tensor and $g$ be the determinant of the
components of $g_{mn}$. Also let $G^{mn}$ be the co-factor¹⁰ of $g_{mn}$ in $g$; then if we define

$$g^{mn} = \frac{G^{mn}}{g} \tag{4-59}$$

it follows from the rules for the expansion of determinants that

$$g_{mn}g^{pn} = \delta^p_m \tag{4-60}$$

We would like to justify our notation and prove that $g^{mn}$ is actually a
tensor. Let $A^n$ be a vector, then $B_m = g_{mn}A^n$ is also a vector, moreover

$$g^{mn}B_m = g^{mn}g_{mp}A^p = \delta^n_p A^p = A^n$$

so that $g^{mn}$ changes a covariant vector into a contravariant one; hence it
must itself be a tensor.

Two vectors related by the equations

$$A_m = g_{mn}A^n; \quad A^m = g^{mn}A_n$$

are called associated. It is often said that both are the same vector, $A^m$
being the contravariant components and $A_m$ the covariant ones. Tensors
of any rank may be treated in the same way, thus

$$A^{mn} = g^{mp}g^{nq}A_{pq}$$

It should be clear that

$$A^{mn}B_{mn} = A_{mn}B^{mn}; \quad A^{mn}B_{pn} = A^{m}{}_{n}B_{p}{}^{n}$$

Because of the fact that dummy indices may be changed from one letter to
another at will it follows that they enjoy a certain freedom of motion.
They may be raised in one place if they are lowered in another. We have
indicated this procedure in the last equation by spacing the indices. Such
information is needed, for it is not true that

$$A^{m}{}_{n} = g^{mp}A_{pn} \quad\text{and}\quad A_{n}{}^{m} = g^{pm}A_{np}$$

are identical unless $A$ is a symmetrical tensor.
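
Raising and lowering of indices can be illustrated with a simple metric. The sketch below assumes the metric of plane polar coordinates, g_mn = diag(1, r²), as the symmetric covariant tensor; that choice, the sample components, and the function names are all assumptions for the example. The inverse is built from cofactors in the manner of (4-59), restricted here to two dimensions.

```python
# Lowering and raising an index with a two-dimensional metric.
# The polar-coordinate metric and the sample vector are illustrative assumptions.

def inverse_metric_2d(g):
    # g^{mn} = G^{mn} / g, where G^{mn} is the cofactor of g_mn (two dimensions only).
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    cof = [[g[1][1], -g[1][0]],
           [-g[0][1], g[0][0]]]
    return [[cof[m][n] / det for n in range(2)] for m in range(2)]

def lower_index(g, A_up):
    # A_m = g_mn A^n
    return [sum(g[m][n] * A_up[n] for n in range(2)) for m in range(2)]

def raise_index(g_inv, A_dn):
    # A^m = g^{mn} A_n
    return [sum(g_inv[m][n] * A_dn[n] for n in range(2)) for m in range(2)]

r = 2.0
g = [[1.0, 0.0],
     [0.0, r * r]]
g_inv = inverse_metric_2d(g)

A_up = [3.0, 0.5]                    # contravariant components
A_dn = lower_index(g, A_up)          # associated covariant components
A_back = raise_index(g_inv, A_dn)    # raising again recovers the original
length_sq = sum(A_up[m] * A_dn[m] for m in range(2))   # l^2 = A^m A_m
print(A_dn, A_back, length_sq)
```

Lowering and then raising returns the original components, and the contraction of the two associated sets gives the squared length as in (4-58).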

4.22. Differentiation of Tensors. — It has been shown in sec. 4.20 that 
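the derivative of a scalar point function is a covariant vector. The derivative
of a covariant vector is not a tensor, however, for if (51) is differentiated,

$$\frac{\partial\bar{A}_m}{\partial\bar{x}^n} = \frac{\partial^2 x^h}{\partial\bar{x}^n\partial\bar{x}^m}A_h + \frac{\partial x^h}{\partial\bar{x}^m}\frac{\partial A_h}{\partial\bar{x}^n} = \frac{\partial^2 x^h}{\partial\bar{x}^n\partial\bar{x}^m}A_h + \frac{\partial x^h}{\partial\bar{x}^m}\frac{\partial x^p}{\partial\bar{x}^n}\frac{\partial A_h}{\partial x^p} \tag{4-61}$$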


and the presence of the second derivative shows that $\partial A_m/\partial x^n$ does not
transform like a tensor. In order to find a “derivative” of the proper
tensor character we first rewrite the second derivative in terms of first
derivatives. To do this let us use the two tensors $g_{ij}$ and $g^{ij}$ defined
previously. Let us further introduce the following quantities (they are not
tensors) called the Christoffel three-index symbols

$$[ij,k] = \frac{1}{2}\left(\frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k}\right) \tag{4-62}$$

$$\{ij,k\} = g^{kh}[ij,h] \tag{4-63}$$

the significance of which will soon be evident. From these definitions we





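According to (53), we have

$$\bar{g}_{mn} = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}g_{ij} \tag{4-65}$$

and it is also true that

$$\frac{\partial g_{ij}}{\partial\bar{x}^q} = \frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial\bar{x}^q} \tag{4-66}$$

Differentiating (65) and using (66) we get

$$\frac{\partial\bar{g}_{mn}}{\partial\bar{x}^q} = g_{ij}\left(\frac{\partial^2 x^i}{\partial\bar{x}^q\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n} + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial^2 x^j}{\partial\bar{x}^q\partial\bar{x}^n}\right) + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}\frac{\partial g_{ij}}{\partial x^k} \tag{4-67}$$

In the same way if we differentiate $\bar{g}_{nq}$ and $\bar{g}_{mq}$ we obtain

$$\frac{\partial\bar{g}_{nq}}{\partial\bar{x}^m} = g_{ij}\left(\frac{\partial^2 x^i}{\partial\bar{x}^m\partial\bar{x}^n}\frac{\partial x^j}{\partial\bar{x}^q} + \frac{\partial x^i}{\partial\bar{x}^n}\frac{\partial^2 x^j}{\partial\bar{x}^m\partial\bar{x}^q}\right) + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}\frac{\partial g_{jk}}{\partial x^i} \tag{4-68}$$

$$\frac{\partial\bar{g}_{mq}}{\partial\bar{x}^n} = g_{ij}\left(\frac{\partial^2 x^i}{\partial\bar{x}^n\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^q} + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial^2 x^j}{\partial\bar{x}^n\partial\bar{x}^q}\right) + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}\frac{\partial g_{ik}}{\partial x^j} \tag{4-69}$$

We may exchange $i$ and $j$ in the second term on the right of these
expressions. If we add (68) and (69), subtract (67) and use eq. (62) we obtain

$$\overline{[mn,q]} = g_{ij}\frac{\partial^2 x^i}{\partial\bar{x}^m\partial\bar{x}^n}\frac{\partial x^j}{\partial\bar{x}^q} + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}[ij,k]$$

where the bar over the Christoffel symbol indicates that it refers to the
coordinate system $\bar{x}^m$. Now multiply this equation by $\bar{g}^{qr}(\partial x^h/\partial\bar{x}^r)$ and
use (64), which gives

$$\overline{\{mn,r\}}\frac{\partial x^h}{\partial\bar{x}^r} = \bar{g}^{qr}g_{ij}\frac{\partial x^h}{\partial\bar{x}^r}\frac{\partial x^j}{\partial\bar{x}^q}\frac{\partial^2 x^i}{\partial\bar{x}^m\partial\bar{x}^n} + \bar{g}^{qr}\frac{\partial x^k}{\partial\bar{x}^q}\frac{\partial x^h}{\partial\bar{x}^r}\frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}[ij,k]$$

By means of (52) we may eliminate $\bar{g}^{qr}$ from the right-hand side of this
equation to obtain

$$\overline{\{mn,r\}}\frac{\partial x^h}{\partial\bar{x}^r} = g_{ij}g^{jh}\frac{\partial^2 x^i}{\partial\bar{x}^m\partial\bar{x}^n} + g^{kh}[ij,k]\frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}$$

Finally, remembering that $g_{ij}g^{jh} = \delta^h_i$, we see that

$$\frac{\partial^2 x^h}{\partial\bar{x}^m\partial\bar{x}^n} = \overline{\{mn,r\}}\frac{\partial x^h}{\partial\bar{x}^r} - \{ij,h\}\frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}$$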


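Let us put this result into (61) which then becomes

$$\frac{\partial\bar{A}_m}{\partial\bar{x}^n} = \overline{\{mn,r\}}\frac{\partial x^h}{\partial\bar{x}^r}A_h - \{ij,h\}\frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}A_h + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial A_i}{\partial x^j} \tag{4-70}$$

where we have changed the dummy indices in the last term from $h$ and $p$ to
$i$ and $j$. Finally we see from (51) that we have

$$\bar{A}_r = \frac{\partial x^h}{\partial\bar{x}^r}A_h$$

so that (70) may be written

$$\frac{\partial\bar{A}_m}{\partial\bar{x}^n} - \overline{\{mn,r\}}\bar{A}_r = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\left(\frac{\partial A_i}{\partial x^j} - \{ij,h\}A_h\right)$$

Now if we use the comma abbreviation

$$A_{i,j} = \frac{\partial A_i}{\partial x^j} - \{ij,h\}A_h$$

it follows that

$$\bar{A}_{m,n} = A_{i,j}\frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}$$

hence this quantity is a covariant tensor of the second rank. It is called
the covariant derivative of $A_i$ with respect to $g_{ij}$.

In a similar way it may be shown that the covariant derivative of $A^i$
with respect to $g_{ij}$ is

$$A^i_{,j} = \frac{\partial A^i}{\partial x^j} + \{jh,i\}A^h \tag{4-71}$$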
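
As a concrete illustration, the Christoffel symbols of plane polar coordinates can be evaluated numerically from the metric. The sketch below assumes the usual definitions, [ij,k] = ½(∂g_ik/∂x^j + ∂g_jk/∂x^i − ∂g_ij/∂x^k) and {ij,k} = g^{kh}[ij,h], and a diagonal metric g = diag(1, r²); the finite-difference step and the function names are hypothetical.

```python
# Christoffel symbols of plane polar coordinates (r, theta) by central differences.
# Assumes a diagonal metric, so that g^{kk} = 1 / g_kk; function names are hypothetical.

def metric_polar(q):
    r, theta = q
    return [[1.0, 0.0], [0.0, r * r]]

def metric_derivatives(metric, q, h=1e-5):
    # dg[k][i][j] approximates d g_ij / d x^k
    n = len(q)
    dg = []
    for k in range(n):
        qp, qm = list(q), list(q)
        qp[k] += h
        qm[k] -= h
        gp, gm = metric(qp), metric(qm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(n)] for i in range(n)])
    return dg

def christoffel_first(metric, q):
    # first[i][j][k] = [ij,k] = 1/2 (d g_ik/dx^j + d g_jk/dx^i - d g_ij/dx^k)
    n = len(q)
    dg = metric_derivatives(metric, q)
    return [[[0.5 * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j]) for k in range(n)]
             for j in range(n)] for i in range(n)]

def christoffel_second(metric, q):
    # second[i][j][k] = {ij,k} = g^{kk} [ij,k] for a diagonal metric (no sum on k)
    n = len(q)
    g = metric(q)
    first = christoffel_first(metric, q)
    return [[[first[i][j][k] / g[k][k] for k in range(n)] for j in range(n)] for i in range(n)]

q = (2.0, 0.3)
first = christoffel_first(metric_polar, q)
second = christoffel_second(metric_polar, q)
print(first[1][1][0], second[1][1][0], second[0][1][1])
```

With index 0 standing for r and index 1 for θ, the only non-vanishing symbols of the second kind should be {θθ,r} = −r and {rθ,θ} = {θr,θ} = 1/r, which the printed values reproduce at r = 2.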

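Problem a. Prove that $[mn,p] = g_{pq}\{mn,q\}$.

Problem b. Show that second derivatives of tensors may be derived in the form

$$A_{ij,k} = \frac{\partial A_{ij}}{\partial x^k} - A_{ih}\{jk,h\} - A_{hj}\{ik,h\}$$

$$A^{ij}_{,k} = \frac{\partial A^{ij}}{\partial x^k} + A^{ih}\{hk,j\} + A^{hj}\{hk,i\}$$

$$A^i_{j,k} = \frac{\partial A^i_j}{\partial x^k} + A^h_j\{hk,i\} - A^i_h\{jk,h\}$$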

4.23. Tensors and the Elastic Body. — As an example of the use of the
tensor in a physical problem let us consider a deformable body subjected
to an infinitely small deformation or strain. Let $P_0$ be a point of the
medium in the unstrained state and let $P$ be its deformed position. If the
coordinates of $P_0$ and $P$ are $x^r_0$ and $x^r$ then the components $u^r_0$ of the
displacement vector will be

$$u^r_0 = x^r - x^r_0 = u^r_0(x^1_0, x^2_0, x^3_0) \tag{4-72}$$


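are $v^r_0$ and $v^r$, the coordinates of $Q_0$ and $Q$ will be $x^r_0 + v^r_0$ and $x^r + v^r$.
It follows¹¹ that

$$(x^r + v^r) - (x^r_0 + v^r_0) = u^r_0 + \left(\frac{\partial u^r}{\partial x^s}\right)_0 v^s_0$$

and, on using (72), that¹²

$$\delta v^r = v^r - v^r_0 = \left(\frac{\partial u^r}{\partial x^s}\right)_0 v^s_0 \tag{4-74}$$
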
The coefficients $(\partial u^r/\partial x^s)_0$ which relate the two vectors $\delta v^r$ and $v^s_0$ are
components of a tensor. The terms $(\partial u^1/\partial x^1)$, $(\partial u^2/\partial x^2)$, $(\partial u^3/\partial x^3)$
are tension strains parallel to the axes $x^1$, $x^2$, $x^3$, respectively; the
remaining terms are shearing strains about these axes; for example,
$(\partial u^2/\partial x^1 + \partial u^1/\partial x^2)$ is the shearing strain about the axis perpendicular to
$x^1$ and $x^2$.

If the nine components of the tensor are written out it will be seen that
it is not in general symmetric. However, it can be made so as shown in
sec. 4.21. Dropping the zero subscripts from (74) we write

$$\delta v^r = t^r_s v^s = e^r_s v^s + \omega^r_s v^s \tag{4-75}$$

where

$$t^r_s = (\partial u^r/\partial x^s); \quad e^r_s = \tfrac{1}{2}(t^r_s + t^s_r); \quad \omega^r_s = \tfrac{1}{2}(t^r_s - t^s_r) \tag{4-76}$$

The coefficients $e^r_s$ are now the components of a symmetrical tensor, which
is called a pure strain. It may be shown (see problem at end of this
section) that $\omega^r_s$ represents a rotation of the neighborhood of $P_0$ about $P_0$.
We could also add to (75) a translation by the amount $a^r$, so that

$$\delta v^r = a^r + e^r_s v^s + \omega^r_s v^s$$

represents the most general displacement of an elastic body, the total
motion being composed of: (1) a translation, (2) a pure strain, and (3) a
rotation.

¹¹ The zero subscript on the derivative is meant to indicate that it is evaluated at the
point $P_0$.

¹² This result only holds for rectangular coordinates. If (74) is to hold in general
coordinates, we must use the covariant derivative of $u^r$.

This brief discussion of tensors is entirely inadequate to indicate their
great value in mathematical physics. The subject has been most frequently
employed in the general theory of relativity.¹³ It may also be
applied with advantage in the study of dynamics, electricity, and
hydrodynamics.¹⁴ The material presented here is sufficient for the use which
will be made of tensors in this book.

Problem. Show that the tensor $\omega^r_s$ represents a rotation.

Hint: Write out the components of (76) and it will be seen that the resulting vector 
is the vector product of two other vectors. 
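
The hint can be checked numerically for any small displacement gradient. The sketch below is only an illustration: the sample values of t^r_s, the vector v, and the function names are arbitrary assumptions. It forms the skew-symmetric part ω^r_s = ½(t^r_s − t^s_r) and compares ω^r_s v^s with the vector product w × v, where w is built from the three independent components of ω.

```python
# The skew-symmetric part of the displacement gradient acts on v like a vector product.
# The sample gradient t and vector v are arbitrary illustrative numbers.

def rotation_part(t):
    # omega[r][s] = (t[r][s] - t[s][r]) / 2
    return [[0.5 * (t[r][s] - t[s][r]) for s in range(3)] for r in range(3)]

def apply(m, v):
    return [sum(m[r][s] * v[s] for s in range(3)) for r in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

t = [[0.01, 0.04, -0.02],
     [0.00, 0.03, 0.05],
     [0.06, -0.01, 0.02]]
v = [1.0, -2.0, 0.5]

omega = rotation_part(t)
# Axial vector assembled from the independent components of omega.
w = [omega[2][1], omega[0][2], omega[1][0]]

lhs = apply(omega, v)    # omega^r_s v^s
rhs = cross(w, v)        # w x v
print(lhs, rhs)
```

Since w × v is the displacement produced by an infinitesimal rotation w, agreement of the two results is what the problem asks the reader to show in general.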

¹³ Eddington, A. S., “The Mathematical Theory of Relativity,” Second Edition,
Cambridge Press, 1930.

14 These subjects have been so treated by McConnell, A. J., “ Applications of the 
Absolute Differential Calculus,” Blackie and Sons, London, 1931, and more briefly by 
Thomas, T. Y., “ The Elementary Theory of Tensors.” McGraw-Hill Book Co., New 
York, 1931. See also Kron, G., “ Short Course in Tensor Analysis for Electrical Engi- 
neers,” John Wiley and Sons, New York, 1942. Tensor methods have been used to 
discuss the elastic properties of solids by Partington, J. R., “ An Advanced Treatise on 
Physical Chemistry,” Vol. 3, The Properties of Solids, Longmans, Green and Co., 
New York, 1952. 

REFERENCES 


In the following books the emphasis is on the mathematical aspects:

Craig, H. V., “Vector and Tensor Analysis,” McGraw-Hill Book Co., New York, 1943.
Lass, H., “Vector and Tensor Analysis,” McGraw-Hill Book Co., New York, 1950.
Spain, B., “Tensor Calculus,” Interscience Publishers, New York, 1953.
Wade, T. L., “Algebra of Vectors and Matrices,” Addison-Wesley Press, Inc., Cambridge, 1951.
Weatherburn, C. E., “Elementary and Advanced Vector Analysis,” 2 vols., G. Bell, London, 1928.

Vector or tensor methods are applied to physical problems in the following:

Brillouin, L., “Les Tenseurs en Mécanique et en Élasticité,” Masson et Cie, Paris, 1938; Dover Publications, Inc., New York.
Milne, E. A., “Vectorial Mechanics,” Interscience Publishers, Inc., New York, 1948.
Rutherford, D. E., “Vector Methods,” Fifth Edition, Interscience Publishers, Inc., New York, 1948.

Dyadics, which are particularly useful in many cases, are treated by:

Gibbs-Wilson, “Vector Analysis,” Yale University Press, New Haven, Conn., 19-5.
Morse, P. M., and Feshbach, H., “Methods of Theoretical Physics,” Part I, Chapter 1, McGraw-Hill Book Co., New York, 1953.


CHAPTER 5 

COORDINATE SYSTEMS 
VECTORS AND CURVILINEAR COORDINATES 

5.1. Curvilinear Coordinates. — Although the methods of vector analysis
prove convenient in the statement of physical laws, it is usually necessary
to rewrite the vector equations in terms of suitable coordinates before the
final solution of a specific problem can be obtained. It is the purpose of
this chapter to show¹ how the components of vectors or vector operators
may be formulated in a system of curvilinear coordinates, the latter being
of so general a nature that it is an easy matter to transform from them to
any one of the several kinds of special coordinate systems which have
been found useful in physical problems.

In Cartesian coordinates, the position of a point $P(x,y,z)$ is determined
by the intersection of three mutually perpendicular planes, $x$ = const.,
$y$ = const., $z$ = const. When $x$, $y$ and $z$ are related to three new quantities
by the equations

$$x = x(q_1,q_2,q_3); \quad y = y(q_1,q_2,q_3); \quad z = z(q_1,q_2,q_3) \tag{5-1}$$

with inverses,

$$q_1 = q_1(x,y,z); \quad q_2 = q_2(x,y,z); \quad q_3 = q_3(x,y,z) \tag{5-2}$$

a given point may be described by specifying either $x$, $y$, $z$ or $q_1$, $q_2$, $q_3$, for
each equation of (2) represents a surface and the intersection of three such
surfaces locates the point. The surfaces $q_1$ = const., $q_2$ = const.,
$q_3$ = const. are called the coordinate surfaces; the space curves formed by
their intersection in pairs are called the coordinate lines. The coordinate
axes are determined by the tangents to the coordinate lines at the
intersection of three surfaces. They are not in general fixed directions in space,
as is true for simple Cartesian coordinates. The quantities $(q_1,q_2,q_3)$ are
the curvilinear coordinates of a point $P(x,y,z)$.

1 The relations which we derive here may be obtained in other ways; see 
and Hobson, E. W., “ The Theory of Spherical and Ellipsoidal Harmonics,” Cc 


Hence the square of the distance between two adjacent points is

$$ds^2 = dx^2 + dy^2 + dz^2 = Q_{11}^2\,dq_1^2 + Q_{22}^2\,dq_2^2 + Q_{33}^2\,dq_3^2 + 2Q_{12}^2\,dq_1dq_2 + 2Q_{13}^2\,dq_1dq_3 + 2Q_{23}^2\,dq_2dq_3 \tag{5-3}$$

where,

$$Q_{ij}^2 = \frac{\partial x}{\partial q_i}\frac{\partial x}{\partial q_j} + \frac{\partial y}{\partial q_i}\frac{\partial y}{\partial q_j} + \frac{\partial z}{\partial q_i}\frac{\partial z}{\partial q_j}; \quad Q_{ii}^2 = \left(\frac{\partial x}{\partial q_i}\right)^2 + \left(\frac{\partial y}{\partial q_i}\right)^2 + \left(\frac{\partial z}{\partial q_i}\right)^2 \tag{5-4}$$

For convenience we shall hereafter omit a repeated subscript, writing for
instance $Q_i$ instead of $Q_{ii}$.

The distance between two points on a coordinate line is called the line
element. It is given by eq. (3) when variation is limited to only one of the $q_i$:

$$ds_i = Q_i\,dq_i \quad (i = 1,2,3) \tag{5-5}$$

The direction cosines between these line elements and $dx$, $dy$ or $dz$ may be
arranged as shown in Table 1 of sec. 4.1; for example, the cosine of the
angle between $ds_1$ and $dz$ is $(\partial z/\partial q_1)(\partial q_1/\partial s_1) = (\partial z/\partial q_1)/Q_1$, and the cosine
of the angle $\alpha_{ij}$ between $ds_i$ and $ds_j$ is

$$\cos\alpha_{ij} = Q_{ij}^2/Q_iQ_j$$

The most useful coordinate systems are orthogonal ones, that is, systems in
which surfaces always intersect at right angles. We shall limit ourselves
to such systems in secs. 5.2 to 5.15, returning to the more general
case of non-orthogonal systems in sec. 5.16. For the present, then,
$\cos\alpha_{ij} = 0$, $Q_{ij} = 0$, and the cross product terms may be dropped from
(3). The three possible surface elements in orthogonal systems thus become

$$dS_{ij} = ds_i\,ds_j = Q_iQ_j\,dq_i\,dq_j \quad (i,j = 1,2,3;\ i \ne j) \tag{5-6}$$

and the volume element,

$$d\tau = ds_1\,ds_2\,ds_3 = Q_1Q_2Q_3\,dq_1\,dq_2\,dq_3 \tag{5-7}$$



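5.2. Vector Relations in Curvilinear Coordinates. — If $\phi$ is a scalar point
function, $\nabla\phi$ must be the same in all coordinate systems, for $\nabla\phi$ is a vector
whose magnitude and direction give the maximum space rate of change of
$\phi$. A component of $\nabla\phi$ is its directional derivative (see sec. 4.9) in a
given direction, thus the component perpendicular to the surface $q_i$ = constant
and hence in the direction of $ds_i$ is

$$\frac{\partial\phi}{\partial s_i} = \frac{1}{Q_i}\frac{\partial\phi}{\partial q_i}$$

in accordance with eq. (5). Since it is also possible to regard $\nabla$ as a vector
operator, it may be written in terms of unit vectors, $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ along the
curvilinear coordinate axes. Thus,

$$\nabla = \frac{\mathbf{u}_1}{Q_1}\frac{\partial}{\partial q_1} + \frac{\mathbf{u}_2}{Q_2}\frac{\partial}{\partial q_2} + \frac{\mathbf{u}_3}{Q_3}\frac{\partial}{\partial q_3} \tag{5-8}$$

so that

$$\nabla\phi = \frac{\mathbf{u}_1}{Q_1}\frac{\partial\phi}{\partial q_1} + \frac{\mathbf{u}_2}{Q_2}\frac{\partial\phi}{\partial q_2} + \frac{\mathbf{u}_3}{Q_3}\frac{\partial\phi}{\partial q_3} \tag{5-9}$$

Any vector may be written in terms of curvilinear components $V_1$,
$V_2$, $V_3$:

$$\mathbf{V} = \mathbf{u}_1V_1 + \mathbf{u}_2V_2 + \mathbf{u}_3V_3 \tag{5-10}$$

but in order to find $\nabla\cdot\mathbf{V}$ (see sec. 4.10) in curvilinear coordinates, we must
know the relation between $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ and $x$, $y$, $z$. We proceed by evaluating
$\nabla\cdot\mathbf{u}_i$, starting with $\nabla\times\mathbf{u}_i$ since this is needed to obtain $\nabla\cdot\mathbf{u}_i$.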

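Remembering that $\mathbf{u}_1/Q_1$ is the product of a scalar and a vector, we
may write in view of (4-26)

$$\nabla\times\frac{\mathbf{u}_1}{Q_1} = \nabla\left(\frac{1}{Q_1}\right)\times\mathbf{u}_1 + \frac{1}{Q_1}(\nabla\times\mathbf{u}_1) = -\mathbf{u}_1\times\nabla\left(\frac{1}{Q_1}\right) + \frac{1}{Q_1}(\nabla\times\mathbf{u}_1) \tag{5-11}$$

the change of sign coming from the change of order in the vector product.
From (9), we note that $\mathbf{u}_1/Q_1 = \nabla q_1$ and from (4-30) that

$$\nabla\times\nabla q_1 = 0$$

hence,

$$\mathbf{u}_1\times\nabla\left(\frac{1}{Q_1}\right) = \frac{1}{Q_1}(\nabla\times\mathbf{u}_1) \tag{5-12}$$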


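Now using (8) and performing the differentiation, we find

$$\nabla\left(\frac{1}{Q_1}\right) = -\frac{1}{Q_1^2}\left[\frac{\mathbf{u}_1}{Q_1}\frac{\partial Q_1}{\partial q_1} + \frac{\mathbf{u}_2}{Q_2}\frac{\partial Q_1}{\partial q_2} + \frac{\mathbf{u}_3}{Q_3}\frac{\partial Q_1}{\partial q_3}\right] \tag{5-13}$$

When we further recall that

$$\mathbf{u}_i\times\mathbf{u}_i = 0; \quad \mathbf{u}_2\times\mathbf{u}_3 = \mathbf{u}_1 \tag{5-14}$$

and substitute (13) in (12), we obtain

$$\nabla\times\mathbf{u}_1 = \frac{\mathbf{u}_2}{Q_1Q_3}\frac{\partial Q_1}{\partial q_3} - \frac{\mathbf{u}_3}{Q_1Q_2}\frac{\partial Q_1}{\partial q_2} \tag{5-15}$$

The scalar product of $\nabla$ and a unit vector may be written as

$$\nabla\cdot\mathbf{u}_1 = \nabla\cdot(\mathbf{u}_2\times\mathbf{u}_3) = \mathbf{u}_3\cdot(\nabla\times\mathbf{u}_2) - \mathbf{u}_2\cdot(\nabla\times\mathbf{u}_3) \tag{5-16}$$

by using (14) and (4-26). This becomes

$$\nabla\cdot\mathbf{u}_1 = \frac{1}{Q_1Q_2Q_3}\frac{\partial(Q_2Q_3)}{\partial q_1} \tag{5-17}$$

when we expand the vector product by (15) and use the fact that

$$\mathbf{u}_i\cdot\mathbf{u}_i = 1; \quad \mathbf{u}_i\cdot\mathbf{u}_j = 0 \tag{5-18}$$

In order to determine $\nabla\cdot\mathbf{V}$ in curvilinear coordinates, we see from (10),
that

$$\nabla\cdot\mathbf{V} = \nabla\cdot(\mathbf{u}_1V_1) + \nabla\cdot(\mathbf{u}_2V_2) + \nabla\cdot(\mathbf{u}_3V_3)$$

a typical term becoming

$$\nabla\cdot(\mathbf{u}_iV_i) = V_i\nabla\cdot\mathbf{u}_i + \mathbf{u}_i\cdot\nabla V_i \tag{5-19}$$

by (4-26). When $\nabla\cdot\mathbf{u}_i$ is written in the form of (17), $\nabla V_i$ in the form of
(9) and (18) used to eliminate the scalar products of the unit vectors, the
three terms of (19) reduce to

$$\nabla\cdot\mathbf{V} = \frac{1}{Q_1Q_2Q_3}\left[\frac{\partial}{\partial q_1}(V_1Q_2Q_3) + \frac{\partial}{\partial q_2}(V_2Q_1Q_3) + \frac{\partial}{\partial q_3}(V_3Q_1Q_2)\right] \tag{5-20}$$

If $\mathbf{V} = \nabla\phi$,

$$\nabla\cdot\nabla\phi = \nabla^2\phi = \frac{1}{Q_1Q_2Q_3}\left\{\frac{\partial}{\partial q_1}\left[\frac{Q_2Q_3}{Q_1}\frac{\partial\phi}{\partial q_1}\right] + \frac{\partial}{\partial q_2}\left[\frac{Q_1Q_3}{Q_2}\frac{\partial\phi}{\partial q_2}\right] + \frac{\partial}{\partial q_3}\left[\frac{Q_1Q_2}{Q_3}\frac{\partial\phi}{\partial q_3}\right]\right\} \tag{5-21}$$

since the components of $\nabla\phi$ are $V_i = (\partial\phi/\partial q_i)/Q_i$.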
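
Formula (5-21) can be tested numerically on a function whose Laplacian is known. The sketch below is a hedged illustration: it assumes spherical polar coordinates with Q = (1, r, r sin θ), takes the sample function x² (written in those coordinates), whose Cartesian Laplacian is 2, and approximates the derivatives by nested central differences. The step size and all names are hypothetical.

```python
import math

# Laplacian from (5-21) in spherical polar coordinates, by nested central differences.
# The test function f = x^2 has Cartesian Laplacian 2; names and step size are assumptions.

def Q(q):
    r, theta, phi = q
    return (1.0, r, r * math.sin(theta))

def f(q):
    r, theta, phi = q
    x = r * math.sin(theta) * math.cos(phi)
    return x * x

def shifted(q, i, d):
    p = list(q)
    p[i] += d
    return tuple(p)

def laplacian(func, scale, q, h=1e-3):
    def flux(i, p):
        # (Q_j Q_k / Q_i) * d(func)/dq_i evaluated at the point p
        Q1, Q2, Q3 = scale(p)
        dfdq = (func(shifted(p, i, h)) - func(shifted(p, i, -h))) / (2.0 * h)
        return Q1 * Q2 * Q3 / scale(p)[i] ** 2 * dfdq

    total = 0.0
    for i in range(3):
        total += (flux(i, shifted(q, i, h)) - flux(i, shifted(q, i, -h))) / (2.0 * h)
    Q1, Q2, Q3 = scale(q)
    return total / (Q1 * Q2 * Q3)

value = laplacian(f, Q, (1.5, 1.0, 0.7))
print(value)
```

The point of evaluation is kept away from the polar axis, where sin θ vanishes and the expression becomes singular in these coordinates.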

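The curl of a vector in terms of the unit curvilinear vectors becomes

$$\nabla\times\mathbf{V} = \nabla\times(\mathbf{u}_1V_1) + \nabla\times(\mathbf{u}_2V_2) + \nabla\times(\mathbf{u}_3V_3)$$

which may be expanded by using (11), to give terms like

$$\nabla\times(\mathbf{u}_iV_i) = V_i(\nabla\times\mathbf{u}_i) - \mathbf{u}_i\times(\nabla V_i)$$

When three similar equations are added together, the result in
determinantal form is

$$\nabla\times\mathbf{V} = \frac{1}{Q_1Q_2Q_3}\begin{vmatrix} Q_1\mathbf{u}_1 & Q_2\mathbf{u}_2 & Q_3\mathbf{u}_3 \\ \partial/\partial q_1 & \partial/\partial q_2 & \partial/\partial q_3 \\ V_1Q_1 & V_2Q_2 & V_3Q_3 \end{vmatrix}$$

In order to compute $\nabla^2\mathbf{V}$ in curvilinear coordinates, use is made of the
relation (4-32).

$$\nabla^2\mathbf{V} = \nabla(\nabla\cdot\mathbf{V}) - \nabla\times\nabla\times\mathbf{V}$$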


which may be reduced to the desired form by means of (8), (2< 
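The component of the resulting expression along the $\mathbf{u}_1$ direction is

$$\frac{1}{Q_1}\frac{\partial}{\partial q_1}(\nabla\cdot\mathbf{V}) - \frac{1}{Q_2Q_3}\frac{\partial}{\partial q_2}\left[Q_3(\nabla\times\mathbf{V})_3\right] + \frac{1}{Q_2Q_3}\frac{\partial}{\partial q_3}\left[Q_2(\nabla\times\mathbf{V})_2\right]$$

where $(\nabla\times\mathbf{V})_3$ and $(\nabla\times\mathbf{V})_2$ are the components of $\nabla\times\mathbf{V}$ along $\mathbf{u}_3$
and $\mathbf{u}_2$. The two other components of $\nabla^2\mathbf{V}$ are obtained from this by
cyclic permutation of the subscripts 1, 2, 3.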

The task of computing any of these vector quantities in s{ 
nate systems is seen to involve calculation of the Qi which ma 
a straightforward way from (4) provided relations like (1) or (2 
In the remainder of this chapter we discuss those special sy 
appear to be most useful. We include all those which may be 
the three-dimensional Schrodinger wave equation of quantur 
It has been shown 2 that the method of separation of variabli 
ter 7) is applicable to this equation only if the potential ene 
form 

V = Uf( qi )/Ql 

and the coordinates have certain special properties. Ther 
such systems; these are the ones described in secs. 5. 3-5. 9, 5 
the confocal ellipsoidal system of sec. 5.6 expressed in tern 
integrals. We indicate other examples of the use of some of tl 
we proceed. In each case, we describe the geometry, give 
between the new coordinates and x, y , z and list the resulting 

2 Robertson, H. P., Math . Ann. 98, 749 (1928); Eisenhart, L. P., Ph 
(1934); 74, 87 (1948). These coordinate systems have been discussed 
detail by Morse, P. M., and Feshbach, H., “ Methods of Theoretical Pt 




n (4). Calculation of V<t>, V i 2 $, V X V, etc., may be performed as an 
rcise by the student 3 (see problems in later sections). 

SPECIAL ORTHOGONAL COORDINATE SYSTEMS 

5.3. Cartesian Coordinates. — These form a trivial case of curvilinear
coordinate systems.

$$Q_x^2 = Q_y^2 = Q_z^2 = 1 \tag{5-25}$$

5.4. Spherical Polar Coordinates. — The coordinate surfaces are families
of (1) concentric spheres about the origin ($r$ = const.), (2) right circular
cones with apex at the origin and axis along $z$ ($\theta$ = const.), (3) half-planes
from the Z-axis ($\phi$ = const.). A point $P(x,y,z)$ is located by specifying
the radius $r$ of the sphere on which it lies, its colatitude $\theta$, and its longitude
or azimuth $\phi$ on the sphere. From Fig. 1, it follows that

[Fig. 1]

$$x = r\sin\theta\cos\phi$$




Remembering that $ds_i = Q_i\,dq_i$, values of the $Q_i$ may also be determined by
inspection from the figure, thus

$$Q_r^2 = 1; \quad Q_\theta^2 = r^2; \quad Q_\phi^2 = r^2\sin^2\theta$$

5.5. Cylindrical Coordinates. — The coordinate surfaces are: (1) right
circular cylinders which form families of concentric circles about the
origin in the XY-plane ($\rho$ = const.); (2) half-planes from the Z-axis
($\phi$ = const.); (3) planes parallel to the XY-plane ($z$ = const.).






sheet ($\mu$ = const.); (3) hyperboloids of two sheets ($\nu$ = const.) given by
the equations

$$\frac{x^2}{a^2-\lambda} + \frac{y^2}{b^2-\lambda} + \frac{z^2}{c^2-\lambda} = 1$$

$$\frac{x^2}{a^2-\mu} + \frac{y^2}{b^2-\mu} - \frac{z^2}{\mu-c^2} = 1 \tag{5-30}$$

$$\frac{x^2}{a^2-\nu} - \frac{y^2}{\nu-b^2} - \frac{z^2}{\nu-c^2} = 1$$

where $\lambda$, $\mu$, $\nu$ are parameters called ellipsoidal coordinates; $a$, $b$, $c$ are
constants; $a^2 > \nu > b^2 > \mu > c^2 > \lambda > -\infty$. It is shown in books on solid
analytical geometry that intersections of these three surfaces are orthogonal
and that all of them have common foci. Moreover, through any fixed
point $P(x,y,z)$ there passes one and only one surface of each type.

The relation between the new and the old coordinates may be found by
solving (30) directly. It may be done more easily as follows. Consider
the cubic equation in a parameter $q$

$$\frac{x^2}{a^2-q} + \frac{y^2}{b^2-q} + \frac{z^2}{c^2-q} - 1 = 0 \tag{5-31}$$

with three real roots, $\lambda$, $\mu$, $\nu$ satisfying the inequalities just stated. As $q$
varies between $a^2$ and $-\infty$, (31) describes the complete system of confocal
surfaces given in (30). On clearing (31) of fractions and equating it to its
identity, we have

$$x^2(b^2-q)(c^2-q) + y^2(a^2-q)(c^2-q) + z^2(a^2-q)(b^2-q) - (a^2-q)(b^2-q)(c^2-q) \equiv (q-\lambda)(q-\mu)(q-\nu) = 0 \tag{5-32}$$

and this must hold for every value of $q$. Upon setting $q = a^2$, $b^2$, $c^2$ in
turn, we obtain

$$x^2 = \frac{(a^2-\lambda)(a^2-\mu)(a^2-\nu)}{(b^2-a^2)(c^2-a^2)}$$

$$y^2 = \frac{(b^2-\lambda)(b^2-\mu)(b^2-\nu)}{(a^2-b^2)(c^2-b^2)} \tag{5-33}$$

$$z^2 = \frac{(c^2-\lambda)(c^2-\mu)(c^2-\nu)}{(a^2-c^2)(b^2-c^2)}$$
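
The procedure can be tried numerically: find the three roots of the cubic (5-31) for a given point and then confirm that (5-33) returns the squares of the original Cartesian coordinates. The sketch below is an assumption-laden illustration. It uses simple bisection between the poles of (5-31), requires x, y and z to be nonzero, and the semi-axis values, sample point, and function names are hypothetical.

```python
# Ellipsoidal coordinates (lam, mu, nu) of a point as the three roots of the cubic (5-31).
# The bisection solver and all names are illustrative assumptions; x, y, z must be nonzero.

def ellipsoidal_coordinates(x, y, z, a, b, c):
    a2, b2, c2 = a * a, b * b, c * c

    def f(q):
        return x * x / (a2 - q) + y * y / (b2 - q) + z * z / (c2 - q) - 1.0

    def bisect(lo, hi):
        # f increases from negative to positive values inside each interval used below
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if f(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    lam = bisect(c2 - (x * x + y * y + z * z) - 1.0, c2)   # root below c^2
    mu = bisect(c2, b2)                                    # root between c^2 and b^2
    nu = bisect(b2, a2)                                    # root between b^2 and a^2
    return lam, mu, nu

a, b, c = 3.0, 2.0, 1.0
x, y, z = 1.0, 0.8, 0.5
lam, mu, nu = ellipsoidal_coordinates(x, y, z, a, b, c)

a2, b2, c2 = a * a, b * b, c * c
x2 = (a2 - lam) * (a2 - mu) * (a2 - nu) / ((b2 - a2) * (c2 - a2))
y2 = (b2 - lam) * (b2 - mu) * (b2 - nu) / ((a2 - b2) * (c2 - b2))
z2 = (c2 - lam) * (c2 - mu) * (c2 - nu) / ((a2 - c2) * (b2 - c2))
print(lam, mu, nu)
print(x2, y2, z2)
```

Only the squares of x, y and z are recovered, which shows directly that a set (λ, μ, ν) cannot distinguish between the eight points obtained by changing the signs of the Cartesian coordinates.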


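Taking the logarithm of (33), differentiating partially with respect to $\lambda$
and using (4), we have

$$Q_\lambda^2 = \frac{1}{4}\left[\frac{(a^2-\mu)(a^2-\nu)}{(a^2-\lambda)(b^2-a^2)(c^2-a^2)} + \frac{(b^2-\mu)(b^2-\nu)}{(b^2-\lambda)(a^2-b^2)(c^2-b^2)} + \frac{(c^2-\mu)(c^2-\nu)}{(c^2-\lambda)(a^2-c^2)(b^2-c^2)}\right] \tag{5-34}$$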





Values for $Q_\mu^2$ and $Q_\nu^2$ may be obtained in a similar way or from (34) by
cyclic interchange of $\lambda$, $\mu$, $\nu$. Simplification of the resulting expressions
yields

$$Q_\lambda^2 = \frac{1}{4}\left[\frac{(\mu-\lambda)(\nu-\lambda)}{(a^2-\lambda)(b^2-\lambda)(c^2-\lambda)}\right]$$

$$Q_\mu^2 = \frac{1}{4}\left[\frac{(\nu-\mu)(\lambda-\mu)}{(a^2-\mu)(b^2-\mu)(c^2-\mu)}\right] \tag{5-35}$$

$$Q_\nu^2 = \frac{1}{4}\left[\frac{(\lambda-\nu)(\mu-\nu)}{(a^2-\nu)(b^2-\nu)(c^2-\nu)}\right]$$

It is somewhat laborious to transform (34) directly into the first equation of
(35) but their equivalence may be verified by writing the latter in terms of
partial fractions.

Because of the fact that $x$, $y$ and $z$ appear as squares in (33), a given
point $P(x,y,z)$ is not uniquely determined by $(\lambda,\mu,\nu)$; in fact, eight points
symmetrically located relative to the (XYZ)-axes correspond to the set
$(\lambda,\mu,\nu)$. This ambiguity may be resolved by adopting some convention
concerning the signs of $(\lambda,\mu,\nu)$, or in more elegant fashion by the
introduction of elliptic functions. The latter procedure may be accomplished
either by means of elliptic integrals, Jacobian elliptic functions or
Weierstrass $\wp$-functions.⁴

The confocal ellipsoidal coordinate system has proved useful in problems
of mechanics, potential theory, electrodynamics and hydrodynamics.⁵

5.7. Prolate Spheroidal Coordinates. — Degenerate cases of the preceding system may arise if two or three of the axes in (31) become equal. Additional surfaces are then needed since the resulting equation in $q$ is either quadratic or linear. Instead of following a method similar to that used for ellipsoidal coordinates, it is simpler to proceed by considering the equations of an ellipse and a hyperbola,

$$\frac{z^2}{a^2}+\frac{x^2}{a^2(1-e_1^2)}=1;\qquad \frac{z^2}{a^2}-\frac{x^2}{a^2(e_2^2-1)}=1 \tag{5-36}$$


4 Full details concerning these functions may be found in Whittaker, E. T. and Watson, G. N., "A Course of Modern Analysis," Fourth Edition, Cambridge Press, 1927.

5 Some references to these applications are: MacMillan, W. D., "Statics and Dynamics of a Particle," 1927, "The Theory of the Potential," 1930, McGraw-Hill Book

1 . . A Tl T?Ann/Jo + iAno nf Pn+anfiol ” T Snn’nnror Rorlin 1 QOQ • 


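where $e_1<1$ is the eccentricity of the ellipse and $e_2>1$, the eccentricity of the hyperbola. If we now substitute $a\cosh u$ for $a$ and $\operatorname{sech} u$ for $e_1$ in the ellipse, $a\cos v$ for $a$ and $\sec v$ for $e_2$ in the hyperbola and finally $x^2+y^2=r^2$ for $x^2$, we obtain

$$\frac{z^2}{a^2\cosh^2u}+\frac{r^2}{a^2\sinh^2u}=1;\qquad \frac{z^2}{a^2\cos^2v}-\frac{r^2}{a^2\sin^2v}=1 \tag{5-37}$$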


with $0\le u<\infty$; $0\le v\le\pi$. These equations represent the confocal families of: (1) prolate spheroids 6 ($u$ = const.) and (2) hyperboloids (of two sheets) of revolution ($v$ = const.) obtained by rotating the ellipses and hyperbolas of (36) around the Z-axis. The intersection of these surfaces, as shown in Fig. 3, will be a circle of radius $r$; hence if $0\le\phi\le2\pi$, the addition of (3), a family of planes through the Z-axis ($\phi$ = const.), to the spheroids and hyperboloids gives us three suitable coordinate surfaces $(u,v,\phi)$. We may then solve (37) for $z$ and $r$ and simplify the resulting expressions by means of the relations between trigonometric functions. Finally, we set $x=r\cos\phi$, $y=r\sin\phi$, obtaining

$$x=a\sinh u\sin v\cos\phi;\qquad y=a\sinh u\sin v\sin\phi;\qquad z=a\cosh u\cos v \tag{5-38}$$

and from (4),

$$Q_u^2=Q_v^2=a^2(\sinh^2u+\sin^2v);\qquad Q_\phi^2=a^2\sinh^2u\,\sin^2v \tag{5-39}$$


An important property of prolate spheroidal coordinates makes them useful in certain quantum mechanical problems. It is well known from analytical geometry that the sum of the focal radii of an ellipse is a constant, equal to the major axis. Similarly the difference between the focal radii of a hyperbola equals the transverse axis. If $r_A$ and $r_B$ are the distances from the two foci to a point of intersection of the ellipsoids and hyperboloids, we find that

$$r_A+r_B=2a\cosh u;\qquad r_A-r_B=2a\cos v$$

where we have replaced $a$ by $a\cosh u$ and by $a\cos v$ as before. This procedure thus locates a point relative to any two-center problem such as the diatomic molecule (see sec. 11.21). It is often convenient to introduce the

6 Also called ovary ellipsoids. 


coordinates $\xi$ and $\eta$ in place of $\cosh u$ and $\cos v$, respectively, so that

$$\xi=\frac{r_A+r_B}{2a};\qquad \eta=\frac{r_A-r_B}{2a}$$

In terms of these variables, the volume element may be seen to take the form

$$d\tau=a^3(\xi^2-\eta^2)\,d\xi\,d\eta\,d\phi$$
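The focal-radius property of prolate spheroidal coordinates is easy to confirm numerically. The following is a minimal sketch with hypothetical function names; it assumes the foci sit on the Z-axis at $z=\pm a$ and that $r_A$ is measured from the focus at $z=-a$, a labeling the text does not fix.

```python
import math


def prolate_to_cartesian(a, u, v, phi):
    """Eq. (5-38): map (u, v, phi) to Cartesian (x, y, z)."""
    x = a * math.sinh(u) * math.sin(v) * math.cos(phi)
    y = a * math.sinh(u) * math.sin(v) * math.sin(phi)
    z = a * math.cosh(u) * math.cos(v)
    return x, y, z


def focal_radii(a, x, y, z):
    """Distances to the foci, assumed at z = -a (r_A) and z = +a (r_B)."""
    r_a = math.sqrt(x * x + y * y + (z + a) ** 2)
    r_b = math.sqrt(x * x + y * y + (z - a) ** 2)
    return r_a, r_b


A, U, V, PHI = 1.5, 0.8, 1.1, 0.3
R_A, R_B = focal_radii(A, *prolate_to_cartesian(A, U, V, PHI))
XI = (R_A + R_B) / (2 * A)   # should equal cosh u
ETA = (R_A - R_B) / (2 * A)  # should equal cos v
```

With these sample values `XI` and `ETA` should reproduce $\cosh u$ and $\cos v$, so a point can be located from its two focal distances alone, as the text describes for two-center problems.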

5.8. Oblate Spheroidal Coordinates. — When ellipses are rotated about their minor axis, the resulting surfaces are oblate spheroids. 7 If we rewrite (37) so that the axis of revolution is again the Z-axis, but now the minor axis of the ellipse, we have

$$\frac{r^2}{a^2\cosh^2u}+\frac{z^2}{a^2\sinh^2u}=1;\qquad \frac{r^2}{a^2\sin^2v}-\frac{z^2}{a^2\cos^2v}=1 \tag{5-41}$$

with $0\le u<\infty$, $0\le v\le\pi$, $x=r\cos\phi$, $y=r\sin\phi$, $0\le\phi\le2\pi$. The coordinate surfaces are thus: (1) oblate spheroids ($u$ = const.); (2) hyperboloids (of one sheet) of revolution ($v$ = const.); (3) planes through the Z-axis ($\phi$ = const.). From (41), we find

$$x=a\cosh u\sin v\cos\phi;\qquad y=a\cosh u\sin v\sin\phi;\qquad z=a\sinh u\cos v \tag{5-42}$$

and from (4),

$$Q_u^2=Q_v^2=a^2(\sinh^2u+\cos^2v);\qquad Q_\phi^2=a^2\cosh^2u\,\sin^2v \tag{5-43}$$

The geometry of the system may be inferred from Fig. 3 by suitable interchange of the X-, Y- and Z-axes.
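As a quick consistency check on the oblate system, one can substitute the parametrization back into the two surface equations. The sketch below uses hypothetical function names and arbitrary sample values; it assumes the surface equations in the form $r^2/(a^2\cosh^2u)+z^2/(a^2\sinh^2u)=1$ and $r^2/(a^2\sin^2v)-z^2/(a^2\cos^2v)=1$.

```python
import math


def oblate_to_cartesian(a, u, v, phi):
    """Oblate spheroidal parametrization: r = a cosh u sin v, z = a sinh u cos v."""
    r = a * math.cosh(u) * math.sin(v)
    return r * math.cos(phi), r * math.sin(phi), a * math.sinh(u) * math.cos(v)


def oblate_surface_residuals(a, u, v, phi):
    """Residuals of the spheroid (u = const.) and hyperboloid (v = const.) equations."""
    x, y, z = oblate_to_cartesian(a, u, v, phi)
    r2 = x * x + y * y
    spheroid = r2 / (a * math.cosh(u)) ** 2 + z * z / (a * math.sinh(u)) ** 2 - 1.0
    hyperboloid = r2 / (a * math.sin(v)) ** 2 - z * z / (a * math.cos(v)) ** 2 - 1.0
    return spheroid, hyperboloid


RES_SPHEROID, RES_HYPERBOLOID = oblate_surface_residuals(2.0, 0.6, 0.9, 1.3)
```

Both residuals should vanish to rounding error, since the first reduces to $\sin^2v+\cos^2v-1$ and the second to $\cosh^2u-\sinh^2u-1$.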

5.9. Elliptic Cylindrical Coordinates. — If (37) is again rewritten with $x^2$ in place of $z^2$ and $y^2$ in place of $r^2$, the loci of these equations are cylindrical surfaces, whose elements are parallel to the Z-axis and perpendicular to the XY-plane. Their intersections with this plane are ellipses and hyperbolas. The coordinate surfaces are: (1) elliptic cylinders ($u$ = const.); (2) hyperbolic cylinders ($v$ = const.); (3) planes parallel to the

7 Also called planetary ellipsoids. The figures of the earth and of the plane 


XY-plane ($z$ = const.). Proceeding as before,

$$x=a\cosh u\cos v;\qquad y=a\sinh u\sin v;\qquad z=z \tag{5-44}$$

$$Q_u^2=Q_v^2=a^2(\sinh^2u+\sin^2v);\qquad Q_z^2=1 \tag{5-45}$$

The intersection of these cylinders with the XY-plane may also be inferred from Fig. 3.



5.10. Conical Coordinates. — A further degenerate case of the system of sec. 5.6 arises when the orthogonal sets of surfaces are: (1) spheres with centers at the origin and radius $u$ ($u$ = const.); (2) cones with apexes at the origin and axes along the Z-axis ($v$ = const.); (3) cones with apexes at the origin and axes along the X-axis ($w$ = const.), their equations being

$$x^2+y^2+z^2=u^2;\qquad \frac{x^2}{v^2}+\frac{y^2}{v^2-b^2}-\frac{z^2}{c^2-v^2}=0;\qquad \frac{x^2}{w^2}-\frac{y^2}{b^2-w^2}-\frac{z^2}{c^2-w^2}=0 \tag{5-46}$$

$$x^2=\frac{u^2v^2w^2}{b^2c^2};\qquad y^2=\frac{u^2(v^2-b^2)(w^2-b^2)}{b^2(b^2-c^2)};\qquad z^2=\frac{u^2(v^2-c^2)(w^2-c^2)}{c^2(c^2-b^2)} \tag{5-47}$$

and from (4)

$$Q_u^2=1;\qquad Q_v^2=\frac{u^2(v^2-w^2)}{(v^2-b^2)(c^2-v^2)};\qquad Q_w^2=\frac{u^2(v^2-w^2)}{(w^2-b^2)(w^2-c^2)} \tag{5-48}$$


5.11. Confocal Paraboloidal Coordinates. — A system similar to that of sec. 5.6 has coordinate surfaces consisting of confocal families of: (1) elliptic paraboloids extending in the direction of the negative Z-axis ($\lambda$ = const.); (2) hyperbolic paraboloids ($\mu$ = const.); (3) elliptic paraboloids extending along the positive Z-axis ($\nu$ = const.). The equations for the surfaces are

$$\frac{x^2}{a^2-\lambda}+\frac{y^2}{b^2-\lambda}+2z+\lambda=0;\qquad \frac{x^2}{a^2-\mu}-\frac{y^2}{\mu-b^2}+2z+\mu=0;\qquad \frac{x^2}{\nu-a^2}+\frac{y^2}{\nu-b^2}-2z-\nu=0 \tag{5-49}$$

where $-\infty<\lambda<b^2<\mu<a^2<\nu<+\infty$. Proceeding as in the confocal ellipsoidal system, we may write the cubic equation in $q$,

$$\frac{x^2}{a^2-q}+\frac{y^2}{b^2-q}+2z+q=0 \tag{5-50}$$

with three real roots, $\lambda$, $\mu$, $\nu$. As $q$ varies between $-\infty$ and $+\infty$, the complete system of confocal surfaces (49) will be described. On clearing (50) of fractions and equating it to its identity, we have

$$x^2(b^2-q)+y^2(a^2-q)+(2z+q)(a^2-q)(b^2-q)-(q-\lambda)(q-\mu)(q-\nu)=0 \tag{5-51}$$

Expressions for $x^2$ and $y^2$ may be obtained from (51) by setting $q=a^2$, $b^2$ in turn; the result for $z$ is found by equating the coefficients of $q^2$ on both sides of (51). We thus have

$$x^2=\frac{(a^2-\lambda)(a^2-\mu)(a^2-\nu)}{(b^2-a^2)};\qquad y^2=\frac{(b^2-\lambda)(b^2-\mu)(b^2-\nu)}{(a^2-b^2)};\qquad z=\tfrac12(a^2+b^2-\lambda-\mu-\nu) \tag{5-52}$$

$$Q_\lambda^2=\frac14\frac{(\mu-\lambda)(\nu-\lambda)}{(a^2-\lambda)(b^2-\lambda)};\qquad Q_\mu^2=\frac14\frac{(\nu-\mu)(\lambda-\mu)}{(a^2-\mu)(b^2-\mu)};\qquad Q_\nu^2=\frac14\frac{(\lambda-\nu)(\mu-\nu)}{(a^2-\nu)(b^2-\nu)} \tag{5-53}$$


Because of the appearance of $x$ and $y$ as squares in (52), a given set $(\lambda,\mu,\nu)$ corresponds to four points $P(x,y,z)$ symmetrically located with respect to the XZ- and YZ-planes. As in the confocal ellipsoidal system (sec. 5.6) the ambiguity may be removed by the use 9 of elliptic integrals.

5.12. Parabolic Coordinates. — If two roots of (50) become equal, the preceding method fails since there are now only two surfaces. In this case, consider the families of parabolas

$$x^2=2\xi^2(z+\xi^2/2);\qquad x^2=-2\eta^2(z-\eta^2/2) \tag{5-54}$$

The vertices of all parabolas lie on the Z-axis at distances $-\xi^2/2$ and $\eta^2/2$, respectively, and all of them have a common focus at the origin of the Cartesian coordinate system. If we now rotate these parabolas about the Z-axis, the resulting intersections are circles and the paraboloids of revolution are still given by (54) if we replace $x^2$ by $r^2=x^2+y^2$, $x=r\cos\phi$, $y=r\sin\phi$. We thus obtain

$$x=\xi\eta\cos\phi;\qquad y=\xi\eta\sin\phi;\qquad z=(\eta^2-\xi^2)/2 \tag{5-55}$$

and from (4),

$$Q_\xi^2=Q_\eta^2=(\xi^2+\eta^2);\qquad Q_\phi^2=\xi^2\eta^2 \tag{5-56}$$
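The scale factors quoted for each system can be checked without any algebra by differentiating the transformation numerically. The sketch below assumes that eq. (4) has the usual form $Q_i^2=\sum_k(\partial x_k/\partial q_i)^2$; the helper names and the step size are illustrative choices, not part of the text.

```python
import math


def scale_factors_squared(transform, q, h=1e-6):
    """Approximate Q_i^2 = sum_k (dx_k/dq_i)^2 by central differences."""
    result = []
    for i in range(len(q)):
        plus, minus = list(q), list(q)
        plus[i] += h
        minus[i] -= h
        xp, xm = transform(*plus), transform(*minus)
        result.append(sum(((p - m) / (2.0 * h)) ** 2 for p, m in zip(xp, xm)))
    return result


def parabolic(xi, eta, phi):
    """Parabolic coordinates: x = xi*eta*cos(phi), y = xi*eta*sin(phi), z = (eta^2 - xi^2)/2."""
    return (xi * eta * math.cos(phi), xi * eta * math.sin(phi), (eta ** 2 - xi ** 2) / 2.0)


XI, ETA, PHI = 1.2, 0.7, 0.4
Q_XI2, Q_ETA2, Q_PHI2 = scale_factors_squared(parabolic, (XI, ETA, PHI))
```

For these sample values the first two entries should both be close to $\xi^2+\eta^2$ and the third close to $\xi^2\eta^2$. The same `scale_factors_squared` helper can be pointed at any of the other transformations in this chapter.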


See, for example, Maxwell, J. C., “ A Treatise on Electricity and Magnetism," 


The coordinate surfaces are: (1) paraboloids of revolution extending in the direction of the positive Z-axis ($\xi$ = const.); (2) paraboloids of revolution extending toward the negative Z-direction ($\eta$ = const.); (3) planes through the Z-axis ($\phi$ = const.). Intersections of these surfaces with the XZ- and XY-planes are shown in Fig. 4. Parabolic coordinates have been used in the treatment of the Stark effect. 10



Fig. 5-4 


5.13. Parabolic Cylindrical Coordinates. — A system similar to elliptic cylindrical coordinates is obtained by adding planes to the parabolic cylinders represented by (54). If we replace $z$ by $y$ in those equations, we have

$$x=\xi\eta;\qquad y=(\eta^2-\xi^2)/2;\qquad z=z \tag{5-57}$$

$$Q_\xi^2=Q_\eta^2=(\xi^2+\eta^2);\qquad Q_z^2=1 \tag{5-58}$$

10 Schrödinger, E., Ann. Physik 80, 457 (1926); Epstein, P. S., Phys. Rev.



The coordinate surfaces are: (1) parabolic cylinders ($\xi$ = const.); (2) parabolic cylinders ($\eta$ = const.); (3) planes ($z$ = const.). The intersection of these surfaces with the XY-plane is like the system of confocal parabolas shown in Fig. 4.

5.14. Bipolar Coordinates. — Before considering this system, we list a few relations which are needed in the subsequent discussion. In terms of exponentials, we may write

$$\sin x=\frac{1}{2i}(e^{ix}-e^{-ix});\qquad \cos x=\frac12(e^{ix}+e^{-ix});\qquad \tan x=\frac{\sin x}{\cos x}=\frac{i(1-e^{2ix})}{(1+e^{2ix})} \tag{5-59}$$

Replacing $x$ by $ix$, we have the corresponding hyperbolic functions

$$\sin ix=\frac{i}{2}(e^{x}-e^{-x})=i\sinh x;\qquad \cos ix=\frac12(e^{x}+e^{-x})=\cosh x;\qquad \tan ix=\frac{i(e^{2x}-1)}{(e^{2x}+1)}=i\tanh x \tag{5-60}$$

Fig. 5-5 


We shall also need the inverse circular function $\tan^{-1}x=u$. Since $x=\tan u$, it follows from (59) that

$$e^{2iu}=\frac{(i-x)}{(i+x)}\qquad\text{and}\qquad 2iu=\ln\frac{(i-x)}{(i+x)} \tag{5-61}$$



Suppose a point $P(x,y)$ is located as shown in Fig. 5 by means of two vectors $\mathbf r_1$ and $\mathbf r_2$ and two angles $\theta_1$ and $\theta_2$. For different positions of the point in the XY-plane, the vectors are always drawn from the fixed points A and B symmetrically located on the X-axis a distance $2a$ apart. If $\rho=x+iy$; $\rho^*=x-iy$, then

$$x=(\rho^*+\rho)/2;\qquad y=\frac{i}{2}(\rho^*-\rho) \tag{5-62}$$

The coordinates of the point are

$$\rho-a=r_1e^{i\theta_1};\qquad \rho+a=r_2e^{i\theta_2} \tag{5-63}$$

and from the geometry of Fig. 5, it follows that

$$r_1^2=(x-a)^2+y^2;\quad \theta_1=\tan^{-1}\frac{y}{x-a};\qquad r_2^2=(x+a)^2+y^2;\quad \theta_2=\tan^{-1}\frac{y}{x+a} \tag{5-64}$$

Defining new quantities

$$\xi=\theta_1-\theta_2;\qquad \eta=\ln\frac{r_2}{r_1} \tag{5-65}$$

and dividing the two equations of (63) by each other

$$\frac{\rho+a}{\rho-a}=e^{-i\chi};\qquad \frac{\rho}{a}=\frac{e^{-i\chi}+1}{e^{-i\chi}-1} \tag{5-66}$$

where

$$\chi=\xi+i\eta \tag{5-67}$$

In order to find $x$ and $y$ as functions of $\xi$ and $\eta$, substitute (66) and (67) in (62). When use is made of (59) and (60) the results are

$$x=\frac{a\sinh\eta}{\cosh\eta-\cos\xi};\qquad y=\frac{a\sin\xi}{\cosh\eta-\cos\xi} \tag{5-68}$$


To find the form of the coordinate surfaces, we start from the definition of $\xi$ and use (61) to obtain

$$\xi=\frac{i}{2}\ln\frac{(ix-ia+y)(ix+ia-y)}{(ix-ia-y)(ix+ia+y)}$$

which may also be written as

$$x^2+y^2-a^2+2iay\,\frac{(1+e^{2i\xi})}{(1-e^{2i\xi})}=0$$

We observe from (59) that the last term of this expression equals $-2ay/\tan\xi=-2ay\cot\xi$. Hence

$$x^2+y^2-a^2-2ay\cot\xi=0$$

or

$$x^2+(y-a\cot\xi)^2=a^2(1+\cot^2\xi)=a^2\csc^2\xi \tag{5-69}$$

In the same way we find

$$e^{2\eta}=\frac{r_2^2}{r_1^2}=\frac{(x+a)^2+y^2}{(x-a)^2+y^2}$$

and

$$(x-a\coth\eta)^2+y^2=a^2\operatorname{csch}^2\eta \tag{5-70}$$
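The bipolar relations can be exercised end to end: map $(\xi,\eta)$ to $(x,y)$, recover the angles and radii of Fig. 5, and confirm that the point lies on both circles. This is a sketch under stated assumptions: the function names are hypothetical, point A is taken at $x=+a$ and B at $x=-a$ (consistent with $\rho-a=r_1e^{i\theta_1}$), and the sample point lies above the X-axis so that no branch adjustment of the angles is needed.

```python
import math


def bipolar_to_cartesian(a, xi, eta):
    """x = a sinh(eta)/(cosh(eta) - cos(xi)), y = a sin(xi)/(cosh(eta) - cos(xi))."""
    d = math.cosh(eta) - math.cos(xi)
    return a * math.sinh(eta) / d, a * math.sin(xi) / d


def cartesian_to_bipolar(a, x, y):
    """xi = theta1 - theta2 and eta = ln(r2/r1), valid as written for y > 0."""
    theta1 = math.atan2(y, x - a)
    theta2 = math.atan2(y, x + a)
    r1 = math.hypot(x - a, y)
    r2 = math.hypot(x + a, y)
    return theta1 - theta2, math.log(r2 / r1)


A, XI, ETA = 2.0, 1.0, 0.5
X, Y = bipolar_to_cartesian(A, XI, ETA)
XI_BACK, ETA_BACK = cartesian_to_bipolar(A, X, Y)

# Residuals of the two circle equations through the point (X, Y).
CIRCLE_XI = X ** 2 + (Y - A / math.tan(XI)) ** 2 - (A / math.sin(XI)) ** 2
CIRCLE_ETA = (X - A / math.tanh(ETA)) ** 2 + Y ** 2 - (A / math.sinh(ETA)) ** 2
```

The round trip should return the original $(\xi,\eta)$ and both circle residuals should vanish to rounding error.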

We thus see that for $\xi$ = const., $0\le\xi\le2\pi$, we have a family of circles with centers on the Y-axis at the point, $x=0$, $y=a\cot\xi$, the radii of the circles being $a\csc\xi$. Each member of this family will pass through the fixed points A and B as shown in Fig. 6 and will intersect the circles $\eta$ = const. orthogonally. The members of the second family have radii of length $a\operatorname{csch}\eta$ and are all situated on the X-axis at the points




$\eta=-\infty$. When $\eta=0$, the circles degenerate into points on the
The position of a point in the XY-plane is thus fixed when we know in which quadrant it lies and furthermore the constant values of $\eta$, $\xi$ for the circles which pass through it. Since the fixed points A and B (that is, the X-axis) divide each circle of the set $\xi$ = const. into two segments, we arbitrarily take $\xi=\xi_0<\pi$ for the arc above the X-axis and $\xi=\xi_0+\pi$ for points below this axis.

In order to use these circles as a coordinate system in space, imagine them to be moved along the Z-axis. Then (69) and (70) represent two families of right circular cylinders with axes parallel to the Z-axis. Suitable coordinate surfaces are then: (1) cylinders with centers on the Y-axis ($\xi$ = const.); (2) cylinders with centers on the X-axis ($\eta$ = const.); (3) planes perpendicular to the Z-axis ($z$ = const.). From (68) and (4),

$$Q_\xi^2=Q_\eta^2=\frac{a^2}{(\cosh\eta-\cos\xi)^2};\qquad Q_z^2=1 \tag{5-71}$$

Bipolar coordinates are useful 11 in problems of hydrodynamics and electricity.

5.15. Toroidal Coordinates. — If we rewrite (69) and (70) with $z^2$ substituted for $y^2$ and $r^2=x^2+y^2$ for $x^2$, the resulting equations

$$2az\cot\xi=r^2+z^2-a^2;\qquad 4a^2r^2\coth^2\eta=(r^2+z^2+a^2)^2 \tag{5-72}$$

represent the families of spheres and tores (or anchor rings) obtained by rotating the circles of the previous system about the Z-axis. If we take as the third surface planes through the Z-axis, $\psi$ = const., then

$$y/x=\tan\psi \tag{5-73}$$

The orthogonal coordinate surfaces are thus: (1) spheres with centers on the axis of revolution at distances $\pm a\cot\xi$ from the origin and radii $a\csc\xi$ ($\xi$ = const.); (2) anchor rings or tores, whose axial circles have radii $a\coth\eta$ and whose cross-sections are circles of radii $a\operatorname{csch}\eta$ ($\eta$ = const.); (3) planes through the Z-axis ($\psi$ = const.). The spheres and anchor rings have a common circle, $r=a$, $z=0$. With methods similar to those used in sec. 5.14,

$$x=r\cos\psi;\qquad y=r\sin\psi;\qquad r=\frac{a\sinh\eta}{\cosh\eta-\cos\xi};\qquad z=\frac{a\sin\xi}{\cosh\eta-\cos\xi} \tag{5-74}$$

$$Q_\xi^2=Q_\eta^2=\frac{a^2}{(\cosh\eta-\cos\xi)^2};\qquad Q_\psi^2=\frac{a^2\sinh^2\eta}{(\cosh\eta-\cos\xi)^2} \tag{5-75}$$


This system has found application 12 in certain problems of electricity and 
of potential theory. 


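Problem a. Show that in spherical polar coordinates:

$$\nabla\cdot\mathbf V=\frac{1}{r^2\sin\theta}\left\{\sin\theta\frac{\partial}{\partial r}(r^2V_r)+r\frac{\partial}{\partial\theta}(\sin\theta\,V_\theta)+r\frac{\partial V_\phi}{\partial\phi}\right\}$$

$$\nabla^2=\frac{1}{r^2\sin\theta}\left\{\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right)+\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right)+\frac{1}{\sin\theta}\frac{\partial^2}{\partial\phi^2}\right\}$$

$$(\nabla\times\mathbf V)_r=\frac{1}{r\sin\theta}\left\{\frac{\partial}{\partial\theta}(\sin\theta\,V_\phi)-\frac{\partial V_\theta}{\partial\phi}\right\}$$

$$(\nabla\times\mathbf V)_\theta=\frac{1}{r\sin\theta}\left\{\frac{\partial V_r}{\partial\phi}-\sin\theta\frac{\partial(rV_\phi)}{\partial r}\right\}$$

$$(\nabla\times\mathbf V)_\phi=\frac{1}{r}\left\{\frac{\partial}{\partial r}(rV_\theta)-\frac{\partial V_r}{\partial\theta}\right\}$$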

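Problem b. Show that in cylindrical coordinates:

$$\nabla^2=\frac{1}{\rho}\left\{\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right)+\frac{1}{\rho}\frac{\partial^2}{\partial\phi^2}+\rho\frac{\partial^2}{\partial z^2}\right\}$$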

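Problem c. If $V$ is the potential energy and $m$ is the mass of a particle show that Newton's laws of motion become:

(1) in spherical polar coordinates

$$m\{\ddot r-r\dot\theta^2-r\sin^2\theta\,\dot\phi^2\}=-\frac{\partial V}{\partial r}$$

$$m\left\{\frac{1}{r}\frac{d}{dt}(r^2\dot\theta)-r\sin\theta\cos\theta\,\dot\phi^2\right\}=-\frac{1}{r}\frac{\partial V}{\partial\theta}$$

$$m\left\{\frac{1}{r\sin\theta}\frac{d}{dt}(r^2\sin^2\theta\,\dot\phi)\right\}=-\frac{1}{r\sin\theta}\frac{\partial V}{\partial\phi}$$

(2) in cylindrical coordinates

$$m(\ddot\rho-\rho\dot\phi^2)=-\frac{\partial V}{\partial\rho};\qquad \frac{m}{\rho}\frac{d}{dt}(\rho^2\dot\phi)=-\frac{1}{\rho}\frac{\partial V}{\partial\phi};\qquad m\ddot z=-\frac{\partial V}{\partial z}$$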

NON-ORTHOGONAL COORDINATE SYSTEMS

5.16. Tensor Relations in Curvilinear Coordinates. — When the coordinate surfaces of a curvilinear system are not orthogonal, the methods of tensor analysis prove convenient (see sec. 4.20 ff.). The relations which we are about to derive are more general than those obtained in the first part of this chapter; in fact, we will show that the two formulations of the problem become equivalent for orthogonal coordinates.

Let $(x^1,x^2,x^3)$ be the usual Cartesian coordinates of a point and $(q^1,q^2,q^3)$ be its curvilinear coordinates, as discussed in sec. 5.1. Then in tensor notation, 13 eq. (3) becomes

$$ds^2=g_{ij}\,dq^i\,dq^j \tag{5-3a}$$

where

$$g_{ij}=\frac{\partial x^m}{\partial q^i}\frac{\partial x^m}{\partial q^j}=g_{ji} \tag{5-4a}$$

is identical with $Q_{ij}$ of eq. (4). The line element is clearly

$$ds_i=\sqrt{g_{ii}}\,dq^i;\quad (i\ \text{not summed}) \tag{5-5a}$$

In order to find the surface element, we recall from sec. 4.5 that a surface may be represented as the vector product of two other vectors. Thus let $d\mathbf s_2$ be an infinitesimal displacement at the point $(q^1,q^2,q^3)$ along the coordinate line $q^2$ and $d\mathbf s_3$ be a similar displacement along the line $q^3$. Then the vector $d\mathbf S_1=d\mathbf s_2\times d\mathbf s_3$ is perpendicular to the plane $q^1$ = const. and its magnitude $dS_1$ is the desired surface element in that plane. Before we can obtain the appropriate expression for it in terms of the tensor $g_{ij}$, we must digress in order to consider two important systems of vectors in curvilinear coordinates. Suppose $\mathbf r=\mathbf r(q^1,q^2,q^3)$ is a vector and


$d\mathbf r$ is a small displacement. If we define three vectors

$$\mathbf e_i=\frac{\partial\mathbf r}{\partial q^i}$$

then it is clear that we may write

$$d\mathbf r=\mathbf e_i\,dq^i \tag{5-76}$$

These vectors, $\mathbf e_i$, which we call base vectors, 14 are directed tangentially along the coordinate curves but they are not necessarily of unit length. While it is usually more convenient to resolve an arbitrary vector $\mathbf A$ into components which are multiples of a unit vector we may also write

$$\mathbf A=a^i\mathbf e_i \tag{5-77}$$

and the three scalars $a^i$ are the contravariant components of $\mathbf A$. Let us define another set of base vectors

$$\mathbf e^1=\frac{\mathbf e_2\times\mathbf e_3}{v};\qquad \mathbf e^2=\frac{\mathbf e_3\times\mathbf e_1}{v};\qquad \mathbf e^3=\frac{\mathbf e_1\times\mathbf e_2}{v} \tag{5-78}$$

where $v$ is the scalar triple product $[\mathbf e_1\mathbf e_2\mathbf e_3]$ of sec. 4.6b. These vectors are perpendicular to the planes of $\mathbf e_2$, $\mathbf e_3$; $\mathbf e_3$, $\mathbf e_1$ and $\mathbf e_1$, $\mathbf e_2$, respectively, and it is easily seen that

$$\mathbf e^m\cdot\mathbf e_n=\delta^m_n \tag{5-79}$$

Furthermore, it is true that

$$\mathbf e_1=\frac{\mathbf e^2\times\mathbf e^3}{v'};\qquad \mathbf e_2=\frac{\mathbf e^3\times\mathbf e^1}{v'};\qquad \mathbf e_3=\frac{\mathbf e^1\times\mathbf e^2}{v'}\qquad\text{where } v'=[\mathbf e^1\mathbf e^2\mathbf e^3]\ \text{and}\ vv'=1; \tag{5-80}$$

hence the two sets of vectors $\mathbf e^m$ and $\mathbf e_n$ are said to be reciprocal to each other. In terms of the reciprocal set 15 (76) becomes

$$d\mathbf r=\mathbf e^i\,dq_i \tag{5-81}$$

and (77) becomes

$$\mathbf A=a_i\mathbf e^i \tag{5-82}$$

where the $a_i$ are the covariant components of $\mathbf A$.


14 The systems of base vectors introduced here are treated by matrix methods in 



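If we equate (76) and (81) we obtain

$$d\mathbf r=\mathbf e_i\,dq^i=\mathbf e^j\,dq_j \tag{5-83}$$

and if we multiply by $\mathbf e^i$ or $\mathbf e_j$ we find, because of (79), that

$$dq^i=\mathbf e^i\cdot\mathbf e^j\,dq_j;\qquad dq_j=\mathbf e_j\cdot\mathbf e_i\,dq^i \tag{5-84}$$

Since the square of the distance between two points is given by $ds^2=d\mathbf r\cdot d\mathbf r$, we see from (83) that

$$ds^2=\mathbf e_i\cdot\mathbf e_j\,dq^i\,dq^j=\mathbf e^i\cdot\mathbf e^j\,dq_i\,dq_j \tag{5-85}$$

We may therefore identify the scalar products of the base vectors with the tensors $g_{ij}$ and $g^{ij}$

$$g_{ij}=\mathbf e_i\cdot\mathbf e_j;\qquad g^{ij}=\mathbf e^i\cdot\mathbf e^j \tag{5-86}$$

For later use, we also note that we may equate (77) to (82),

$$\mathbf A=a_i\mathbf e^i=a^j\mathbf e_j \tag{5-87}$$

and use (79) and (86) to write

$$a_i=g_{ij}a^j;\qquad a^i=g^{ij}a_j$$

We also have from (87) the equivalent expressions

$$a_i=\mathbf A\cdot\mathbf e_i;\qquad a^j=\mathbf A\cdot\mathbf e^j \tag{5-88}$$

hence (87) may be stated in the alternative form

$$\mathbf A=(\mathbf A\cdot\mathbf e_i)\mathbf e^i=(\mathbf A\cdot\mathbf e^j)\mathbf e_j \tag{5-89}$$

We now have several relations by means of which we may find the contravariant or covariant components of an arbitrary vector. If we wish to know the components in terms of unit vectors, tangent to the coordinate lines, we recall the equation defining the length of a vector (sec. 4.4) and see that the appropriate unit vectors are

$$\mathbf u_i=\frac{\mathbf e_i}{\sqrt{\mathbf e_i\cdot\mathbf e_i}}=\frac{\mathbf e_i}{\sqrt{g_{ii}}}\quad (i\ \text{not summed})$$

Therefore, any vector $\mathbf A$ may also be written as

$$\mathbf A=A_i\mathbf u_i$$

where

$$A_i=\sqrt{g_{ii}}\,a^i\quad (i\ \text{not summed})$$

If needed, similar equations could be given in the reciprocal system.

Let us now return to the problem of the surface element in curvilinear coordinates. Since $d\mathbf s_i=\mathbf e_i\,dq^i$ ($i$ not summed), we have

$$d\mathbf S_1=d\mathbf s_2\times d\mathbf s_3=(\mathbf e_2\times\mathbf e_3)\,dq^2\,dq^3$$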


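and

$$dS_1=[(\mathbf e_2\times\mathbf e_3)\cdot(\mathbf e_2\times\mathbf e_3)]^{1/2}\,dq^2\,dq^3$$

It is easy to show (see Problem a, sec. 4.6) that the scalar product inside the brackets becomes

$$(\mathbf e_2\cdot\mathbf e_2)(\mathbf e_3\cdot\mathbf e_3)-(\mathbf e_2\cdot\mathbf e_3)(\mathbf e_3\cdot\mathbf e_2)$$

Thus when we use (86) we obtain

$$dS_1=\sqrt{g_{22}g_{33}-g_{23}^2}\;dq^2\,dq^3 \tag{5-6a}$$

Similarly, surface elements on the planes $q^2$ = const. and $q^3$ = const. are

$$dS_2=\sqrt{g_{11}g_{33}-g_{13}^2}\;dq^1\,dq^3;\qquad dS_3=\sqrt{g_{11}g_{22}-g_{12}^2}\;dq^1\,dq^2$$

The volume element,

$$d\tau=d\mathbf s_1\cdot d\mathbf s_2\times d\mathbf s_3=[\mathbf e_1\mathbf e_2\mathbf e_3]\,dq^1\,dq^2\,dq^3$$

If we place $\mathbf A=\mathbf e_2\times\mathbf e_3$ and use (89) we get

$$\mathbf A=\mathbf e_2\times\mathbf e_3=[\mathbf e^1\mathbf e_2\mathbf e_3]\mathbf e_1+[\mathbf e^2\mathbf e_2\mathbf e_3]\mathbf e_2+[\mathbf e^3\mathbf e_2\mathbf e_3]\mathbf e_3$$

Now by means of (4-18) and (78) we eliminate $\mathbf e^i$ to obtain

$$[\mathbf e_1\mathbf e_2\mathbf e_3]=\mathbf e_1\cdot\mathbf A=\frac{\mathbf e_1}{[\mathbf e_1\mathbf e_2\mathbf e_3]}\cdot\{(\mathbf e_2\times\mathbf e_3\cdot\mathbf e_2\times\mathbf e_3)\mathbf e_1+(\mathbf e_3\times\mathbf e_1\cdot\mathbf e_2\times\mathbf e_3)\mathbf e_2+(\mathbf e_1\times\mathbf e_2\cdot\mathbf e_2\times\mathbf e_3)\mathbf e_3\}$$

Finally we expand the scalar products within the brackets using again the result of Problem a, sec. 4.6 and getting

$$\begin{aligned}[\mathbf e_1\mathbf e_2\mathbf e_3]^2={}&\mathbf e_1\cdot\mathbf e_1[(\mathbf e_2\cdot\mathbf e_2)(\mathbf e_3\cdot\mathbf e_3)-(\mathbf e_2\cdot\mathbf e_3)(\mathbf e_3\cdot\mathbf e_2)]\\&+\mathbf e_1\cdot\mathbf e_2[(\mathbf e_2\cdot\mathbf e_3)(\mathbf e_3\cdot\mathbf e_1)-(\mathbf e_2\cdot\mathbf e_1)(\mathbf e_3\cdot\mathbf e_3)]\\&+\mathbf e_1\cdot\mathbf e_3[(\mathbf e_2\cdot\mathbf e_1)(\mathbf e_3\cdot\mathbf e_2)-(\mathbf e_2\cdot\mathbf e_2)(\mathbf e_3\cdot\mathbf e_1)]\end{aligned}$$

By means of (86) we may replace the scalar products in this equation by the $g_{ij}$, finding that $[\mathbf e_1\mathbf e_2\mathbf e_3]=\sqrt g$ where $g$ is the determinant of the components of $g_{ij}$ and the volume element becomes

$$d\tau=\sqrt g\;dq^1\,dq^2\,dq^3 \tag{5-7a}$$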
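The reciprocal-basis relations and the result $[\mathbf e_1\mathbf e_2\mathbf e_3]^2=g$ can be tried on a concrete oblique basis. In the sketch below the three base vectors are arbitrary illustrative choices, and all helper names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triple(a, b, c):
    """Scalar triple product [a b c] = a . (b x c)."""
    return dot(a, cross(b, c))


def reciprocal_basis(e1, e2, e3):
    """Reciprocal vectors e^1, e^2, e^3 built from cross products divided by v = [e1 e2 e3]."""
    v = triple(e1, e2, e3)
    pairs = ((e2, e3), (e3, e1), (e1, e2))
    return tuple(tuple(c / v for c in cross(p, q)) for p, q in pairs)


def metric(basis):
    """g_ij = e_i . e_j."""
    return [[dot(ei, ej) for ej in basis] for ei in basis]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# An arbitrary non-orthogonal basis chosen for illustration.
BASIS = ((1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.5, 0.2, 2.0))
RECIP = reciprocal_basis(*BASIS)
V = triple(*BASIS)
G = det3(metric(BASIS))
DELTA = [[dot(RECIP[m], BASIS[n]) for n in range(3)] for m in range(3)]
```

Here `DELTA` should be the unit matrix, `G` should equal `V**2`, and the triple product of the reciprocal vectors should be `1/V`.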


5.17. The Differential Operators in Tensor Notation. — We have seen 


$\mathbf e^i\,\partial\varphi/\partial q^i$ or, since $\mathbf e^i=g^{ij}\mathbf e_j$, we have in curvilinear coordinates

$$\nabla\varphi=g^{ij}\mathbf e_j\frac{\partial\varphi}{\partial q^i} \tag{5-9a}$$


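The divergence of a vector $\mathbf V$ in terms of its contravariant components is the covariant derivative. Thus, from (4-71)

$$\nabla\cdot\mathbf V=V^i_{,i}=\frac{\partial V^i}{\partial q^i}+\{ij,i\}V^j \tag{5-90}$$

Now according to (4-63)

$$\{ij,i\}=\frac12g^{ik}\left(\frac{\partial g_{ik}}{\partial q^j}+\frac{\partial g_{jk}}{\partial q^i}-\frac{\partial g_{ij}}{\partial q^k}\right)$$

But

$$g^{ik}\frac{\partial g_{jk}}{\partial q^i}=g^{ki}\frac{\partial g_{ji}}{\partial q^k}$$

since we may exchange the dummy indices $i$ and $k$. Moreover, $g^{ik}=g^{ki}$ and $g_{ij}=g_{ji}$ so we may cancel the second and third terms in $\{ij,i\}$. Finally, we refer to (4-59) and the rule for differentiating determinants (see sec. 10.4) to prove that

$$\frac{\partial g}{\partial g_{ij}}=G^{ij}=g\,g^{ij}$$

The Christoffel symbol therefore takes the form

$$\{ij,i\}=\frac12g^{ik}\frac{\partial g_{ik}}{\partial q^j}=\frac{1}{2g}\frac{\partial g}{\partial q^j}=\frac{1}{\sqrt g}\frac{\partial(\sqrt g)}{\partial q^j}$$

Eq. (90) may thus be written as

$$\nabla\cdot\mathbf V=\frac{1}{\sqrt g}\frac{\partial}{\partial q^i}(\sqrt g\,V^i) \tag{5-20a}$$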

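A similar expression may be obtained in terms of covariant components of $\mathbf V$.

If $\mathbf V=\nabla\varphi$, the contravariant components of the gradient are $V^i=\mathbf V\cdot\mathbf e^i$ by (88) and by (9a)

$$V^i=g^{ij}\frac{\partial\varphi}{\partial q^j}$$

Substituting this result in (20a) we find for the Laplacian

$$\nabla^2\varphi=\frac{1}{\sqrt g}\frac{\partial}{\partial q^i}\left[\sqrt g\,g^{ij}\frac{\partial\varphi}{\partial q^j}\right] \tag{5-21a}$$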


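The final expression we wish to derive here is the curl. Define the covariant tensor of rank two

$$V_{ij}=\frac{\partial V_j}{\partial q^i}-\frac{\partial V_i}{\partial q^j}$$

If we transform to a new coordinate system we see from eq. (4-53) that

$$\bar V_{mn}=\frac{\partial q^i}{\partial\bar q^m}\frac{\partial q^j}{\partial\bar q^n}V_{ij}$$

This tensor which is invariant to such a transformation is the curl in curvilinear coordinates. According to its definition, it is skew symmetric, hence the only non-vanishing components are $V_{12}$, $V_{23}$ and $V_{31}$. In terms of the base vectors we write

$$\nabla\times\mathbf V=V_{12}(\mathbf e^1\times\mathbf e^2)+V_{23}(\mathbf e^2\times\mathbf e^3)+V_{31}(\mathbf e^3\times\mathbf e^1)$$

We have shown in (78) how to convert the $\mathbf e^i$ into the reciprocal base vectors and we have also proved that $v=[\mathbf e_1\mathbf e_2\mathbf e_3]=\sqrt g$. With these changes, the curl of a vector $\mathbf V$ is*

$$\nabla\times\mathbf V=\frac{1}{\sqrt g}\left[\left(\frac{\partial V_3}{\partial q^2}-\frac{\partial V_2}{\partial q^3}\right)\mathbf e_1+\left(\frac{\partial V_1}{\partial q^3}-\frac{\partial V_3}{\partial q^1}\right)\mathbf e_2+\left(\frac{\partial V_2}{\partial q^1}-\frac{\partial V_1}{\partial q^2}\right)\mathbf e_3\right] \tag{5-22a}$$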


It is a simple matter to see what happens when the coordinate surfaces are mutually orthogonal. In that case, the vectors $\mathbf e_1$, $\mathbf e_2$, $\mathbf e_3$ are also perpendicular to each other and $\mathbf e^i$ is parallel to $\mathbf e_i$. Moreover,



and $\mathbf e_1\cdot\mathbf e_2=\mathbf e_2\cdot\mathbf e_3=\mathbf e_3\cdot\mathbf e_1=0$. It thus follows that $g_{ij}=0$ unless $i=j$; in the latter case,

qu = ur 

2 

Remembering that $g_{ii}$ is then identical with $Q_i^2$ as used in the first parts of this chapter, equations such as (3a), (4a), etc., in secs. 5.16 and 5.17 will reduce to the corresponding equations which appeared earlier in this chapter without the letter a.


Problem. Derive by the tensor method the results of Problems a, b, c of sec. 5.15.


CHAPTER 6 

CALCULUS OF VARIATIONS 


One of the elementary problems of the differential calculus has to do with the maxima and minima, that is, the stationary values, of a function $y(x)$. The necessary condition for the occurrence of a stationary value at $x=a$ is that $y'(a)=0$. Sufficient conditions that it shall be a minimum or a maximum are, respectively, $y''(a)>0$ and $y''(a)<0$. The calculus of variations deals with a similar, but a more complicated problem: that of finding a function $y(x)$ such that a definite integral, taken over a function of this function, shall be a maximum or a minimum. The simpler part of this calculus, to which this chapter will be primarily devoted, deals with the necessary conditions that the integral shall be either a maximum or a minimum; in other words, that it shall have a stationary value; sufficiency considerations as well as criteria for establishing the maximum or minimum character of the solutions are not important in many physical applications. For these, the reader should consult the more comprehensive treatises on the subject listed at the end of this chapter.

6.1. Single Independent and Single Dependent Variable. — It is desired, then, to find that function $y(x)$ which will cause the integral

$$\int_{x_1}^{x_2}I(x,y,y_x)\,dx \tag{6-1}$$

to have a stationary value. The integrand $I$ is taken to be a function of the dependent variable $y$ as well as the independent variable $x$ and of $y_x=dy/dx$. The limits $x_1$ and $x_2$ are fixed and at each of them, $y$ has a fixed value. The integral over $I$ takes on different values along different paths connecting the points $(x_1,y_1)$ and $(x_2,y_2)$; one of these paths is labeled $Y(x)$ in Fig. 1. We assume that it is either largest or smallest along $y(x)$, for example. The paths $Y(x)$ which are admitted for comparison shall be "adjacent" paths covering a small neighborhood of the stationary path $y(x)$, that is, $Y(x)-y(x)$ shall be infinitesimal for all values of $x$ between $x_1$ and $x_2$.

We define:

$$\delta y(x)\equiv Y(x)-y(x);\qquad \delta I\equiv I(x,Y,Y_x)-I(x,y,y_x) \tag{6-2}$$


The symbol $\delta$ is called variation; it represents the increase in the quantity to which it is applied as we pass from the stationary path to the comparison path at the same value of $x$. Thus, clearly $\delta x=0$. Furthermore,

$$\delta\frac{dy}{dx}=\frac{dY}{dx}-\frac{dy}{dx}=\frac{d}{dx}(Y-y)=\frac{d}{dx}\,\delta y \tag{6-3}$$

This shows that the symbols $\delta$ and $d/dx$ "commute." Since $Y$ and $y$ are adjacent, it follows from (2) that

$$\delta I=I(x,\,y+\delta y,\,y_x+\delta y_x)-I(x,y,y_x)=\frac{\partial I}{\partial y}\delta y+\frac{\partial I}{\partial y_x}\delta y_x \tag{6-4}$$

In words, the formal rules for computing variations are the same as those for computing differentials.


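In terms of this notation, the condition that $\int_{x_1}^{x_2}I\,dx$ be stationary is easily written down. It is simply that the integral along $y$ shall yield the same value as that along $y+\delta y$,

$$\delta\int_{x_1}^{x_2}I\,dx=\int_{x_1}^{x_2}\delta I\,dx=0 \tag{6-5}$$

This is of course the analogue of the condition in the ordinary calculus that $y(x)$ be stationary, i.e., $dy=0$. With the use of (3) and (4), eq. (5) becomes

$$\int_{x_1}^{x_2}\left(\frac{\partial I}{\partial y}\delta y+\frac{\partial I}{\partial y_x}\frac{d}{dx}\delta y\right)dx=0$$

The second term of the integrand yields after partial integration

$$-\int_{x_1}^{x_2}\left(\frac{d}{dx}\frac{\partial I}{\partial y_x}\right)\delta y\,dx+\left[\frac{\partial I}{\partial y_x}\delta y\right]_{x_1}^{x_2}$$

But the integrated part vanishes at both limits because $\delta y_1=\delta y_2=0$. Hence the stationarity condition becomes

$$\int_{x_1}^{x_2}\left(\frac{\partial I}{\partial y}-\frac{d}{dx}\frac{\partial I}{\partial y_x}\right)\delta y\,dx=0 \tag{6-6}$$

While the vanishing of an integral does not in general imply that the integrand is zero, we may nevertheless conclude here that

$$\frac{\partial I}{\partial y}-\frac{d}{dx}\frac{\partial I}{\partial y_x}=0 \tag{6-7}$$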


This is because the parenthesis in (6) is multiplied by an arbitrary though infinitesimally small function of $x$, namely $\delta y$. For if the left-hand side of (7) were not zero for every $x$, it would have to be positive in some range and negative in another range in order to satisfy (6) with a positive $\delta y$. We may then choose $\delta y$ to be positive where the left side of (7) is positive and negative elsewhere, an arrangement which would violate (6). Hence (7) follows and is the condition we are seeking. A function $y$ which satisfies that differential equation is called an extremal. Among these extremals the minimizing or maximizing curve $y$ will be found, provided it exists.

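Eq. (7) was first derived by Euler; it is called the Euler equation associated with the variation problem. It may be written in a different form:

$$\frac{d}{dx}\left(I-y_x\frac{\partial I}{\partial y_x}\right)-\frac{\partial I}{\partial x}=0 \tag{6-7a}$$

which is useful when $I$ does not depend explicitly on $x$, for then (7a) shows that

$$I-y_x\frac{\partial I}{\partial y_x}=\text{const.}$$

represents an extremal. The identity of (7) and (7a) is at once established by noting:

$$\frac{dI}{dx}=\frac{\partial I}{\partial x}+y_x\frac{\partial I}{\partial y}+y_{xx}\frac{\partial I}{\partial y_x}$$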

Examples.

provides a formal proof of this assertion. The element of distance in Cartesian coordinates is given by $ds^2=dx^2+dy^2$. Hence

$$s=\int ds=\int_{x_1}^{x_2}(1+y_x^2)^{1/2}\,dx$$

If this is to be a minimum, Euler's equation (7), with $I=(1+y_x^2)^{1/2}$ must be satisfied. Hence

$$\frac{d}{dx}\left[\frac{\partial}{\partial y_x}(1+y_x^2)^{1/2}\right]=0$$

or

$$\frac{y_x}{(1+y_x^2)^{1/2}}=\text{const.}$$

which means $dy/dx$ = const.

The minimizing curve is the straight line passing through the points $y_1$ and $y_2$. Had we chosen polar coordinates, the problem would have been to find $r$ as a function of $\varphi$ such that

$$s=\int(dr^2+r^2d\varphi^2)^{1/2}=\int(r^2+r_\varphi^2)^{1/2}\,d\varphi$$

is stationary. The Euler equation then reads

$$\frac{r}{(r^2+r_\varphi^2)^{1/2}}-\frac{d}{d\varphi}\frac{r_\varphi}{(r^2+r_\varphi^2)^{1/2}}=0$$

This reduces to

$$\frac{rr_{\varphi\varphi}-2r_\varphi^2-r^2}{(r^2+r_\varphi^2)^{3/2}}=0$$

The expression on the left is simply the curvature of the curve in polar coordinates; hence the result is the same as before.
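The conclusion that the straight line is the shortest path can be illustrated numerically by comparing it with nearby curves that share the same end points. The sketch below is only a demonstration: the end points, the sinusoidal perturbation and the function names are arbitrary choices, and the arc length is approximated by a polygon.

```python
import math


def path_length(y, x1, x2, n=2000):
    """Polygonal approximation of the integral of (1 + y_x^2)^(1/2) dx."""
    total = 0.0
    prev_x, prev_y = x1, y(x1)
    for k in range(1, n + 1):
        x = x1 + (x2 - x1) * k / n
        cur_y = y(x)
        total += math.hypot(x - prev_x, cur_y - prev_y)
        prev_x, prev_y = x, cur_y
    return total


def straight(x):
    """Straight line from (0, 1) to (1, 3)."""
    return 1.0 + 2.0 * x


def perturbed(eps):
    """Adjacent path with the same end points: the variation vanishes at x = 0 and x = 1."""
    return lambda x: straight(x) + eps * math.sin(math.pi * x)


L_STRAIGHT = path_length(straight, 0.0, 1.0)
L_PLUS = path_length(perturbed(0.1), 0.0, 1.0)
L_MINUS = path_length(perturbed(-0.1), 0.0, 1.0)
```

`L_STRAIGHT` should equal $\sqrt5$, and both perturbed lengths should exceed it; shrinking `eps` makes the excess fall off quadratically, which is the numerical signature of a stationary value.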

The element of distance on the surface of a sphere of radius $a$ is given by

$$ds=a(d\theta^2+\sin^2\theta\,d\varphi^2)^{1/2}$$

If we wish to find $\varphi$ as a function of $\theta$ such that $s$ is stationary, we must solve (7) with $I=(1+\sin^2\theta\,\varphi_\theta^2)^{1/2}$:

$$\frac{d}{d\theta}\left[\frac{\sin^2\theta\,\varphi_\theta}{(1+\sin^2\theta\,\varphi_\theta^2)^{1/2}}\right]=0$$

When the bracket is put equal to a constant, $c$, we get

$$\varphi_\theta=\frac{c\,\operatorname{cosec}^2\theta}{(1-c^2-c^2\cot^2\theta)^{1/2}}$$

On integration this gives

$$\varphi=\alpha-\sin^{-1}(k\cot\theta),\qquad k=\frac{c}{(1-c^2)^{1/2}}$$

$\alpha$ and $k$ being new constants. To interpret this result we transform to Cartesian coordinates, using $z=a\cos\theta$. We have $k\cot\theta=\sin(\alpha-\varphi)$ or, on multiplying by $a\sin\theta$,

$$kz=x\sin\alpha-y\cos\alpha$$

This represents a plane passing through the origin and hence cutting the surface of the sphere in a great circle. The shortest (and also the longest) distance between two points on the surface of the sphere has to be measured along the great circle connecting them!

b. The Brachistochrone.

A problem which held the fascination of mathematicians for several decades of the 17th and 18th centuries is that of finding the curve along which an object, in the absence of friction, will slide from one point to another in the shortest (brachistos) time (chronos). John Bernoulli proposed the problem in 1696; both he and his brother James, as well as Newton and Leibnitz, found the correct solution. The curve, which happens to be a cycloid, is known as the brachistochrone.

Let the particle start from rest at the origin; the terminus of its motion is $(x_2,y_2)$. In working this problem it is convenient to draw the Y-axis to the right and to measure $x$ downward. Then from the law of conservation of energy,

$$\tfrac12mv^2=mgx$$

where $v$ is the velocity of the particle at any point of its path, $m$ its mass, and $g$ the acceleration of gravity. Hence, since

$$v=\frac{ds}{dt}=\frac{\sqrt{dx^2+dy^2}}{dt},\qquad dt=\frac{(1+y_x^2)^{1/2}}{(2gx)^{1/2}}\,dx$$

The integral to be minimized is therefore

$$(2g)^{1/2}\,t=\int_0^{x_2}\left(\frac{1+y_x^2}{x}\right)^{1/2}dx$$

Euler's equation reads

$$\frac{d}{dx}\frac{y_x}{[x(1+y_x^2)]^{1/2}}=0$$

Hence

$$\frac{y_x^2}{x(1+y_x^2)}=c$$

If we introduce the constant $2a=1/c$, integration leads to

$$y=a\cos^{-1}\left(1-\frac{x}{a}\right)-(2ax-x^2)^{1/2}+c' \tag{6-8}$$

But the new constant of integration, $c'$, must be zero in order to make $y$ vanish at $x=0$. Eq. (8) represents the equation of an inverted cycloid with its base along Y and its cusp at the origin. (Cf. Fig. 2.) The constant $a$ must be so adjusted that the cycloid passes through the point $(x_2,y_2)$. The path will also be a cycloid if we allow the particle to fall with a finite initial velocity, as the reader may verify.
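The minimizing property of the cycloid can be illustrated by comparing descent times numerically. The sketch below makes several assumptions that are not in the text: the cycloid is written in the parametric form $x=a(1-\cos\theta)$, $y=a(\theta-\sin\theta)$ (equivalent to eq. (8) with $c'=0$ for $0\le\theta\le\pi$), the end point is taken at $\theta=\pi$, $g=9.81$, and the straight chute is parametrized quadratically so that the integrand stays finite at the starting point. All names are illustrative.

```python
import math


def descent_time(path, s_end, g=9.81, n=2000):
    """Midpoint-rule estimate of the integral of ds / sqrt(2 g x) along path(s) = (x, y)."""
    total = 0.0
    for k in range(n):
        s0 = s_end * k / n
        s1 = s_end * (k + 1) / n
        x0, y0 = path(s0)
        x1, y1 = path(s1)
        x_mid = path(0.5 * (s0 + s1))[0]  # depth below the start at the midpoint
        total += math.hypot(x1 - x0, y1 - y0) / math.sqrt(2.0 * g * x_mid)
    return total


A, G = 1.0, 9.81


def cycloid(theta):
    """x measured downward, y to the right."""
    return A * (1.0 - math.cos(theta)), A * (theta - math.sin(theta))


X2, Y2 = cycloid(math.pi)  # end point (2a, pi*a)


def straight(s):
    """Straight chute to the same end point, with s in [0, 1] entering as s^2."""
    return X2 * s * s, Y2 * s * s


T_CYCLOID = descent_time(cycloid, math.pi, G)
T_STRAIGHT = descent_time(straight, 1.0, G)
```

With these choices `T_CYCLOID` should be close to $\pi\sqrt{a/g}$ and `T_STRAIGHT` close to $\sqrt{4+\pi^2}\,\sqrt{a/g}$, so the cycloid is faster by roughly 16 per cent.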





c. Minimum Surface of Revolution.

The soap film problem discussed in sec. 2.2i may also be solved by the method outlined above. Whatever the function $y$, the surface generated by revolving $y$ about the X-axis has an area

$$2\pi\int y\,ds=2\pi\int y(1+y_x^2)^{1/2}\,dx$$

If this is to be a minimum, eq. (7a) requires that

$$y(1+y_x^2)^{1/2}-yy_x^2(1+y_x^2)^{-1/2}=\frac{y}{(1+y_x^2)^{1/2}}=\text{const.}$$

an expression which is identical with our former solution of the soap film problem.

Problem. Solve Example c with the use of equation (7).


we shall suppose that the integrand I occurring in the integral to be
minimized or maximized is a function of one independent and several
dependent variables. In almost all examples relevant to physics the
independent variable is the time, while the coordinates are dependent.
In view of this fact we shall modify our notation, using t in place of the
former x, and x, y, z, etc., in place of the former y. We seek the
functions x(t), y(t), z(t), ... which make the integral

$$\int_{t_1}^{t_2} I(t, x, y, z, \ldots, x_t, y_t, z_t, \ldots)\,dt$$

stationary.


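The Euler condition is derived as before; we require

$$\int_{t_1}^{t_2} \delta I\,dt = 0 \tag{6-9}$$

But in this case

$$\delta I = \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial z}\delta z + \cdots + \frac{\partial I}{\partial x_t}\delta x_t + \frac{\partial I}{\partial y_t}\delta y_t + \cdots$$

In computing the integral (9) we again perform partial integration on the
second group of terms; for example:

$$\int_{t_1}^{t_2} \frac{\partial I}{\partial x_t}\,\delta x_t\,dt = \int_{t_1}^{t_2} \frac{\partial I}{\partial x_t}\,\frac{d}{dt}(\delta x)\,dt = \left[\frac{\partial I}{\partial x_t}\,\delta x\right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \frac{d}{dt}\frac{\partial I}{\partial x_t}\,\delta x\,dt$$

As before, $\delta x$ vanishes at both limits. Hence (9) becomes

$$\int_{t_1}^{t_2} \left[\left(\frac{\partial I}{\partial x} - \frac{d}{dt}\frac{\partial I}{\partial x_t}\right)\delta x + \left(\frac{\partial I}{\partial y} - \frac{d}{dt}\frac{\partial I}{\partial y_t}\right)\delta y + \left(\frac{\partial I}{\partial z} - \frac{d}{dt}\frac{\partial I}{\partial z_t}\right)\delta z + \cdots\right] dt = 0$$

If $\delta x$, $\delta y$, $\delta z$ are entirely arbitrary and independent functions,
the parentheses occurring here must vanish separately. We thus obtain,
in place of the one Euler equation (7), as many as there are dependent
variables:

$$\frac{\partial I}{\partial x} - \frac{d}{dt}\frac{\partial I}{\partial x_t} = 0; \quad \frac{\partial I}{\partial y} - \frac{d}{dt}\frac{\partial I}{\partial y_t} = 0; \quad \frac{\partial I}{\partial z} - \frac{d}{dt}\frac{\partial I}{\partial z_t} = 0; \ \ldots \tag{6-10}$$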


6.3. Example: Hamilton's Principle. — The elements


mental ideas, particularly the energy concept, have been proposed through- 
out the history of the subject. The most important of these is Hamilton’s 
principle. It should be regarded not as a consequence of Newton’s laws 
of force (although it can be shown to be consistent with them) but as a 
parallel fundamental postulate of mechanics which may be useful in cases 
where Newton’s laws are cumbersome in their application. The principle 
takes for granted a knowledge of the kinetic energy, T, of the mechanical
system as a function of the coordinates and their derivatives, and also of
the potential energy, V, as a function of coordinates and possibly the time.
From the functional form of T and V it then permits the deduction of the
coordinates as functions of the time.

The principle postulates that the integral

$$\int_{t_1}^{t_2} (T - V)\,dt$$

shall have a stationary value. The integrand, T − V, is called the Lagrangian
function. We shall consider only conservative mechanical systems,
that is, systems for which V is a function of the coordinates only.

Let us first treat the motion of a simple mass point in three dimensions, 
using rectangular coordinates for its description. Then 


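$$T = \tfrac{1}{2}m(x_t^2 + y_t^2 + z_t^2)$$

and

$$V = V(x, y, z)$$

so that

$$I = \tfrac{1}{2}m(x_t^2 + y_t^2 + z_t^2) - V$$

Eqs. (10) are then seen to be Newton's laws of motion:

$$\frac{d}{dt}(mx_t) = -\frac{\partial V}{\partial x}, \quad \frac{d}{dt}(my_t) = -\frac{\partial V}{\partial y}, \quad \frac{d}{dt}(mz_t) = -\frac{\partial V}{\partial z}$$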

An advantage of Hamilton’s principle becomes apparent when the 
problem is such that another system of coordinates is more natural for its 
solution. In that case Newton’s laws require the transformation of the 
force components to the new coordinates, which is sometimes inconvenient, 
while the scalars T and V are more easily transformed. Thus consider the 
motion of a particle in a central field of force, that is, V = V(r). Using 
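polar coordinates we have

$$T = \tfrac{1}{2}m(r_t^2 + r^2\varphi_t^2), \qquad I = \tfrac{1}{2}m(r_t^2 + r^2\varphi_t^2) - V(r)$$

Dependent variables are r and $\varphi$. Hence Euler's equations are

$$\frac{d}{dt}(mr_t) - mr\varphi_t^2 = -\frac{\partial V}{\partial r}$$

$$\frac{d}{dt}(mr^2\varphi_t) = 0$$

The first of these is the well known radial equation of
planetary motion ($-dV/dr = \text{const.}/r^2$): the term $mr\varphi_t^2$ is the
centripetal force, which appears automatically in the analysis. The
second equation is Kepler's second law, for it states that $r^2\varphi_t$ is constant.
Its meaning is obvious when it is remembered that the area swept out in unit time by
the radius vector is $\tfrac{1}{2}r^2(d\varphi/dt)$.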
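The two Euler equations can be integrated numerically to watch Kepler's second law hold. The following is a minimal sketch under stated assumptions: unit mass, an attractive inverse-square field $V = -k/r$, and a classical fourth-order Runge-Kutta step. All function names and the starting values are hypothetical choices for illustration.

```python
def derivs(state, k=1.0):
    """Right-hand sides of the radial and angular equations for m = 1, V = -k/r."""
    r, r_t, phi, phi_t = state
    return (r_t,
            r * phi_t ** 2 - k / r ** 2,   # from d/dt(m r_t) - m r phi_t^2 = -dV/dr
            phi_t,
            -2.0 * r_t * phi_t / r)        # from d/dt(m r^2 phi_t) = 0

def rk4_step(state, dt, k=1.0):
    """One classical Runge-Kutta step for the four-component state."""
    def shifted(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = derivs(state, k)
    k2 = derivs(shifted(state, k1, dt / 2), k)
    k3 = derivs(shifted(state, k2, dt / 2), k)
    k4 = derivs(shifted(state, k3, dt), k)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def areal_constant_drift(steps=5000, dt=1e-3):
    """Largest change of r^2 * phi_t along a bound orbit started at r = 1."""
    state = (1.0, 0.0, 0.0, 1.2)
    h0 = state[0] ** 2 * state[3]
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        worst = max(worst, abs(state[0] ** 2 * state[3] - h0))
    return worst

print(areal_constant_drift())
```

The printed drift should be at the level of integration error only, although r itself varies considerably along the orbit.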

Turning now to the consideration of more complicated systems,
containing more than one mass point, we first introduce generalized
coordinates, $q_1, q_2, \ldots, q_n$, where n is the number of degrees of freedom. V will
be a function of the q's, but it will not depend on the $q_t$'s. The kinetic
energy, T, however, will be a function of the q's as well as the $q_t$'s (except
when Cartesian coordinates are used).

Hamilton's principle then states that

$$\int_{t_1}^{t_2} \delta\left[T(q_1 q_2 \cdots q_n, q_{1t} q_{2t} \cdots q_{nt}) - V(q_1 \cdots q_n)\right] dt = 0$$

Eqs. (10) become

$$\frac{\partial (T - V)}{\partial q_i} - \frac{d}{dt}\frac{\partial T}{\partial q_{it}} = 0, \qquad i = 1, 2, \ldots, n$$

These are the famous Lagrangian equations of motion, derived by
Lagrange (without the use of the calculus of variations).

To illustrate their applicability and also the use of generalized
coordinates we discuss one further example taken from the field of electricity.
If q is the charge and $i = q_t$ the current in a simple circuit of
capacitance C and self-inductance L, its total energy at any instant can be
shown to be

$$\tfrac{1}{2}Lq_t^2 + \tfrac{1}{2}\frac{q^2}{C}$$

It is clear from the foregoing remarks that the first of these terms
may be regarded as kinetic energy T, the second as potential energy V.
The mechanical meaning
of T and V here becomes lost, as it does in many problems of advanced
dynamics. Lagrange's equation for the present case takes the form

$$Lq_{tt} + \frac{q}{C} = 0$$

and this will be recognized as the differential equation describing the
natural oscillations of an electrical circuit having no resistance. More
complicated examples of the application of Lagrange's equations to electrical
and indeed even thermal phenomena are available. 2
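As an illustration, the circuit equation $Lq_{tt} + q/C = 0$ may be integrated step by step and compared with the expected oscillation $q = q_0\cos(t/\sqrt{LC})$. This is only a sketch: the leapfrog scheme, the function name, and the numerical values of L and C are assumptions chosen for the example.

```python
import math

def simulate_lc(L, C, q0, t_end, steps):
    """Integrate L*q_tt + q/C = 0 from q = q0, i = 0 with a leapfrog scheme."""
    dt = t_end / steps
    q, i = q0, 0.0
    for _ in range(steps):
        i_half = i - 0.5 * dt * q / (L * C)   # half step in the current i = q_t
        q = q + dt * i_half                   # full step in the charge
        i = i_half - 0.5 * dt * q / (L * C)   # second half step in the current
    return q, i

L, C = 2.0, 0.5                               # chosen so that sqrt(L*C) = 1
period = 2.0 * math.pi * math.sqrt(L * C)
q_end, i_end = simulate_lc(L, C, 1.0, period, 20000)
print(q_end, i_end)                           # charge returns close to q0, current close to 0
```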

Problem. For a simple harmonic oscillator, $V = \tfrac{1}{2}kx^2$. Use Hamilton's principle
to obtain its equation of motion.


6.4. Several Independent Variables. — Next, it is necessary to extend 
the simple theory so as to permit the integrand to contain several inde- 
pendent variables. The problem then is to find a function u(x,y,z) such 
that 



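$$\iiint I(x, y, z, u, u_x, u_y, u_z)\,dx\,dy\,dz$$

is stationary. Here we are treating x, y, z as independent variables, u as the
one dependent variable, and we define again: $u_x = \partial u/\partial x$, etc. As before,
we require

$$\iiint \delta I\,dx\,dy\,dz = 0 \tag{6-12}$$

Here $\delta u$ represents the increment incurred in the passage from the
extremal u to some neighboring function U, x, y, and z being held fixed.
Hence $\delta x = \delta y = \delta z = 0$. Therefore

$$\delta I = \frac{\partial I}{\partial u}\delta u + \frac{\partial I}{\partial u_x}\delta u_x + \frac{\partial I}{\partial u_y}\delta u_y + \frac{\partial I}{\partial u_z}\delta u_z$$

In evaluating an integral like $\iiint \frac{\partial I}{\partial u_x}\,\delta u_x\,dx\,dy\,dz$ we first perform the
integration with respect to x, obtaining (since $\delta u$ vanishes at the limits)

$$\int_{x_1}^{x_2} \frac{\partial I}{\partial u_x}\,\delta u_x\,dx = -\int_{x_1}^{x_2} \frac{\partial}{\partial x}\frac{\partial I}{\partial u_x}\,\delta u\,dx$$

and (12) reads

$$\iiint \left(\frac{\partial I}{\partial u} - \frac{\partial}{\partial x}\frac{\partial I}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial I}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial I}{\partial u_z}\right)\delta u\,dx\,dy\,dz = 0$$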


2 See Thomson, J. J., “Applications of Dynamics to Physics and Chemistry," 



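$$\frac{\partial I}{\partial u} - \frac{\partial}{\partial x}\frac{\partial I}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial I}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial I}{\partial u_z} = 0 \tag{6-13}$$

If, in addition to u, there are other dependent variables, eq. (13)
is augmented by other equations in which u is replaced by these variables.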


Examples.

a. Let us find the function u(x,y,z) which has a minimum mean value
of the square of its gradient in a certain region of space. While this
requirement seems artificial at first sight, it is nevertheless of
significance in electrostatic and quantum-mechanical problems. If

$$\iiint (\nabla u)^2\,dx\,dy\,dz$$

is to be stationary, $I = u_x^2 + u_y^2 + u_z^2$ (cf. Chapter 4 for
the operator $\nabla$), and (13) becomes

$$u_{xx} + u_{yy} + u_{zz} = \nabla^2 u = 0$$

This is Laplace's equation which must be satisfied by the
electric potential in free space. (Cf. Chapter 7.)
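The connection between a least mean square gradient and Laplace's equation can be seen on a grid. The sketch below is a discrete two-dimensional analogue, not the continuous problem: it assumes a square mesh, boundary values taken from $u = x^2 - y^2$, and Gauss-Seidel relaxation. The function names are hypothetical.

```python
def relax_laplace(n=11, sweeps=500):
    """Relax interior values toward the average of the four neighbors."""
    h = 1.0 / (n - 1)
    exact = [[(i * h) ** 2 - (j * h) ** 2 for j in range(n)] for i in range(n)]
    u = [[exact[i][j] if i in (0, n - 1) or j in (0, n - 1) else 0.0
          for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                u[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1])
    return u, exact

def dirichlet_energy(u):
    """Sum of squared differences between neighbors (discrete square of the gradient)."""
    n = len(u)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (u[i + 1][j] - u[i][j]) ** 2
            if j + 1 < n:
                total += (u[i][j + 1] - u[i][j]) ** 2
    return total

u, exact = relax_laplace()
bumped = [row[:] for row in u]
bumped[5][5] += 0.1                      # disturb one interior value
print(dirichlet_energy(u), dirichlet_energy(bumped))
```

The relaxed grid reproduces $x^2 - y^2$, and any disturbance of an interior value raises the summed squared gradient.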

b. Vibrating String. Let a string of length l be under tension F. When
it executes small vibrations, it suffers the displacement u(x,t) perpendicular
to its length, which will be taken along x. For any distortion its length is
$l'$, and

$$l' = \int_0^l \sqrt{1 + u_x^2}\,dx$$

If the distortion is small, the integrand may be expanded
so that

$$l' = l + \tfrac{1}{2}\int_0^l u_x^2\,dx$$

The potential energy of the entire string will then be

$$V = \tfrac{1}{2}F\int_0^l u_x^2\,dx$$

provided the tension F is not changed by the small distortion.
The kinetic energy is, clearly,

$$T = \tfrac{1}{2}m\int_0^l u_t^2\,dx$$

where m represents the mass of the string per unit length, considered constant.
Hamilton's principle now states:

$$\iint \left(\tfrac{1}{2}mu_t^2 - \tfrac{1}{2}Fu_x^2\right) dx\,dt$$

will be stationary. The two variables, x and t, are here to be regarded
as the independent ones. The Euler equation (13) for this case is the
wave equation:

$$u_{tt} = \frac{F}{m}\,u_{xx}$$
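A quick numerical check of the wave equation: a standing wave $u = \sin(kx)\cos(\omega t)$ with $\omega = k\sqrt{F/m}$ should satisfy it. The sketch below, with hypothetical names and arbitrarily chosen values of F and m, forms both second derivatives by central differences and returns their mismatch.

```python
import math

def wave_residual(x, t, F=4.0, m=1.0, n=2, length=1.0, h=1e-3):
    """u_tt - (F/m) u_xx for the n-th standing wave on a string of given length."""
    k = n * math.pi / length
    omega = k * math.sqrt(F / m)

    def u(x, t):
        return math.sin(k * x) * math.cos(omega * t)

    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h ** 2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
    return u_tt - (F / m) * u_xx

print(wave_residual(0.3, 0.2))   # small compared with u_tt itself, which is of order omega**2
```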


6.5. Accessory Conditions; Lagrangian Multipliers. — Problems sometimes
arise in which it is necessary to make an integral stationary while at
the same time one or more integrals involving the same variables are to be
kept constant. A typical example, discussed below, is that of finding the
closed plane curve of given perimeter and maximum area. This example,
being one of the earliest to engage mathematical interest, has given this
class of problems the name "isoperimetric."

In general, the presence of accessory conditions can be dealt with by
means of "Lagrange's method of undetermined multipliers," as follows.

We wish to find the stationary value of

$$\int I\,d\tau$$

provided that

$$\int I_1\,d\tau = c_1, \quad \int I_2\,d\tau = c_2, \quad \ldots, \quad \int I_n\,d\tau = c_n \tag{6-14}$$

The I's contain the same variables; the limits are fixed and identical in all
integrations, and the integrations may be multiple; in the latter case $d\tau$
stands for a product of differentials. The c's are understood to be constants.
We introduce a set of n constant parameters, $\lambda_1, \lambda_2, \ldots, \lambda_n$, the values
of which are not at once specified. It is clear that, if $\int I\,d\tau$ is stationary,
$\int K\,d\tau$, where $K = I + \lambda_1 I_1 + \lambda_2 I_2 + \cdots + \lambda_n I_n$, is also stationary whatever
the values of the $\lambda$'s, because of (14). We are thus confronted with a
problem similar to the foregoing, the minimization (or maximization) of a
single integral, but with a modified integrand: I must be replaced by K.


Proceeding by the same steps
as were outlined in sec. 6.1, we arrive at the equivalent of eq. (6), in which
I is now replaced by K. But the passage from (6) to (7) is now not directly justified
because $\delta y$ is no longer an arbitrary function: the variations must be in
accord with the relations (14). One may say that $\delta y$ has lost n degrees of
freedom. But here the unspecified character of the $\lambda$'s comes to the
rescue. They are precisely n in number and can be so adjusted that the
parentheses vanish.³ Hence the transition from eq. (6) to eq. (7) is
permitted in this case as well. The extremals must satisfy the
equation

$$\frac{\partial K}{\partial y} - \frac{d}{dx}\frac{\partial K}{\partial y_x} = 0 \tag{6-15}$$

or its equivalent (7a). If there are several dependent and independent
variables, eqs. (10) and (13) take the place of (15).

In solving Euler's equation the $\lambda$'s, which are now presumably
unknown, appear as constants in the extremals. They may be determined
formally by means of conditions (14), but their meaning can often be
recognized more directly at some stage of the solution.

Examples. 

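a. To find the plane curve of fixed perimeter and maximum area, we
seek that $r(\varphi)$ which maximizes

$$A = \tfrac{1}{2}\int_0^{2\pi} r^2\,d\varphi$$

and has a fixed

$$s = \int_0^{2\pi} (r^2 + r_\varphi^2)^{1/2}\,d\varphi$$

Here

$$K = \tfrac{1}{2}r^2 + \lambda(r^2 + r_\varphi^2)^{1/2}$$

so that (15) reads:

$$r + \lambda r(r^2 + r_\varphi^2)^{-1/2} - \frac{d}{d\varphi}\left[\lambda r_\varphi (r^2 + r_\varphi^2)^{-1/2}\right] = 0$$

This leads to

$$\frac{r r_{\varphi\varphi} - 2r_\varphi^2 - r^2}{(r^2 + r_\varphi^2)^{3/2}} = \frac{1}{\lambda}$$

The left of this equation will be recognized as the curvature, $1/\rho$, of the
curve (apart from sign). This is to be constant, hence the curve is a circle of radius
$\rho = \lambda$.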
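The isoperimetric property of the circle can be illustrated, though of course not proved, by comparing regular polygons of the same perimeter. The following sketch uses hypothetical function names and the elementary area formula $P^2/(4n\tan(\pi/n))$ for a regular n-gon of perimeter P.

```python
import math

def polygon_area(n, perimeter):
    """Area of a regular n-gon with the given perimeter."""
    return perimeter ** 2 / (4.0 * n * math.tan(math.pi / n))

def circle_area(perimeter):
    """Area of the circle with the given circumference."""
    return perimeter ** 2 / (4.0 * math.pi)

P = 4.0
for n in (3, 4, 6, 12, 100):
    print(n, polygon_area(n, P))
print("circle", circle_area(P))
```

The areas increase with the number of sides and remain below that of the circle.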




b. To prove that the sphere is the solid figure of revolution which, 
for a given surface area, has maximum volume. The area is 

$$A = 2\pi\int y\,ds = 2\pi\int_0^a y(1 + y_x^2)^{1/2}\,dx$$

the volume:

$$V = \pi\int_0^a y^2\,dx$$

Therefore

$$K = y^2 + \lambda y(1 + y_x^2)^{1/2}$$

since we are here permitted to drop constant factors. As K does not contain
x explicitly, it is convenient to use (7a) instead of (7) or (15):

$$K - y_x\frac{\partial K}{\partial y_x} = c$$

whence

$$y^2 + \lambda y(1 + y_x^2)^{1/2} - \lambda y y_x^2 (1 + y_x^2)^{-1/2} = c$$

But clearly, y = 0 at x = 0 and at x = a, which can only be true if c = 0.
Hence

$$y^2 + \lambda y(1 + y_x^2)^{-1/2} = 0$$

or

$$y = -\lambda(1 + y_x^2)^{-1/2}$$

Solving this for $y_x$ we obtain

$$\frac{dy}{dx} = \frac{(\lambda^2 - y^2)^{1/2}}{y}$$

which on integration leads to

$$-\sqrt{\lambda^2 - y^2} = x - x_0$$

or,

$$(x - x_0)^2 + y^2 = \lambda^2$$

We note that the figure is a sphere with center on the X-axis at $x_0$ and of
radius $\lambda$.

It is possible to work this problem without the use of Lagrangian
multipliers by means of an ingenious method due to Euler. He uses in
place of the independent variable x a new one, $\xi$, which measures essentially
the area of revolution formed by the arc y(x) between x = 0 and the point x:

$$\xi = \int_0^x y(1 + y_x^2)^{1/2}\,dx$$

In terms of this variable,

$$dx = \frac{1}{y}\,(1 - y^2 y_\xi^2)^{1/2}\,d\xi$$

so that

$$V = \pi\int_0^b y(1 - y^2 y_\xi^2)^{1/2}\,d\xi \tag{6-16}$$

Here b represents the value of $\xi$ when x = a, that is, the given area divided
by $2\pi$. By keeping b fixed the accessory condition A = const. is automatically
satisfied. This method, while very elegant, cannot be applied
generally.

The stationarity condition for (16), if written in the form (7b), yields

$$y(1 - y^2 y_\xi^2)^{1/2} + y^3 y_\xi^2 (1 - y^2 y_\xi^2)^{-1/2} = y(1 - y^2 y_\xi^2)^{-1/2} = c \tag{6-17}$$

whence

$$y_\xi = \frac{(c^2 - y^2)^{1/2}}{cy}$$

After integration,

$$\left(1 - \frac{y^2}{c^2}\right)^{1/2} = d - \frac{\xi}{c^2}$$

The new constant d must be 1 if the curves are to pass through $\xi = y = 0$.
To obtain the result in terms of x and y, we substitute for $y_\xi$ in (17) the
value obtained by solving

$$\xi_x = y(1 + y_x^2)^{1/2} = \frac{y_x}{y_\xi}$$

Eq. (17) then reads

$$y(1 + y_x^2)^{1/2} = c$$

and this is precisely the equation solved above with $-c = \lambda$.


c. Wave equation. In sec. 6.4 we have seen that Laplace’s equation is 
the necessary condition that the average of the square of the gradient of a 
function shall have a stationary value. If the same quantity is to be made 


stationary, but with the additional requirement that $\iiint u^2\,dx\,dy\,dz$ shall have
a fixed value, another interesting equation results. In that case

$$I = u_x^2 + u_y^2 + u_z^2, \qquad I_1 = u^2$$

The integral to be minimized is therefore $\iiint (I + \lambda I_1)\,dx\,dy\,dz$. Euler's
equation [in the form (13)] then reads

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} - \lambda u = 0$$


which is a special form of the wave equation, namely, that describing 
sinusoidal waves of a single frequency. (Cf. Chapter 7.) Such a wave 
may therefore be characterized as a disturbance in which the displacement 
u has a fixed mean square value and at the same time a minimum square 
gradient. 

6.6. Schrodinger Equation. — The fundamental equation of quantum 
mechanics (sec. 11.9) can be derived from a variation principle, as will 
now be shown. We define an operator, known as the Hamiltonian operator, 
as follows : 

$$H \equiv -k\nabla^2 + V(x, y, z)$$

The physical meaning of k is seen from the relation $k = h^2/8\pi^2 m$ where h is
Planck's constant and m the mass of the particle whose motion is considered;
V is its potential energy. We now seek a function $\psi$, possibly
complex, which satisfies the following two conditions:

$$\iiint \psi^*(H\psi)\,dx\,dy\,dz \tag{6-18a}$$

shall be stationary;

$$\iiint \psi^*\psi\,dx\,dy\,dz = 1 \tag{6-18b}$$


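The integrations are taken over fixed domains of x, y, and z. It will be
supposed, furthermore, that the permissible functions $\psi$ and $\psi^*$ either
vanish sufficiently strongly at the boundaries of the volume of integration,
or take on the same values and derivatives at corresponding points on
opposite boundaries.

When this is true, the following transformation may be made:

$$\int_{x_1}^{x_2} \psi^*\frac{\partial^2\psi}{\partial x^2}\,dx = \left[\psi^*\frac{\partial\psi}{\partial x}\right]_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{\partial\psi^*}{\partial x}\frac{\partial\psi}{\partial x}\,dx$$

The integrated part vanishes. As a consequence

$$\iiint \psi^*\nabla^2\psi\,dx\,dy\,dz = -\iiint (\nabla\psi^*)\cdot(\nabla\psi)\,dx\,dy\,dz$$

and condition (18a) may be modified to read

$$\delta\iiint \left[k(\nabla\psi^*)\cdot(\nabla\psi) + V(x,y,z)\,\psi^*\psi\right] dx\,dy\,dz = 0$$

The function K which appears in Euler's equation [(15) but generalized in
accordance with (13) to take care of the fact that there are now three
independent variables] is

$$K = k(\psi_x^*\psi_x + \psi_y^*\psi_y + \psi_z^*\psi_z) + V\psi^*\psi - \lambda\psi^*\psi$$

Euler's equations are ($\psi^*$ and $\psi$ are both dependent variables!)

$$\frac{\partial K}{\partial\psi} - \frac{\partial}{\partial x}\frac{\partial K}{\partial\psi_x} - \frac{\partial}{\partial y}\frac{\partial K}{\partial\psi_y} - \frac{\partial}{\partial z}\frac{\partial K}{\partial\psi_z} = 0$$

$$\frac{\partial K}{\partial\psi^*} - \frac{\partial}{\partial x}\frac{\partial K}{\partial\psi_x^*} - \frac{\partial}{\partial y}\frac{\partial K}{\partial\psi_y^*} - \frac{\partial}{\partial z}\frac{\partial K}{\partial\psi_z^*} = 0$$

They reduce to

$$-k(\psi_{xx} + \psi_{yy} + \psi_{zz}) + V\psi - \lambda\psi = 0 \tag{6-19}$$

and a similar equation for $\psi^*$. To identify the constant $\lambda$, we note that
eq. (19) may be written

$$H\psi = \lambda\psi$$

If we multiply this equation by $\psi^*$ and integrate over x, y, z the left side
becomes the stationary integral (18a), which will be denoted by E. The
right is $\lambda$ in view of (18b). Hence $\lambda = E$. With this substitution for $\lambda$,
eq. (19) is Schrödinger's equation.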

This result is worth summarizing. Schrödinger's equation serves the
purpose of selecting the extremals $\psi$ which make $\iiint \psi^*(H\psi)\,dx\,dy\,dz$
stationary, provided $\iiint \psi^*\psi\,dx\,dy\,dz$ is held constant. If the latter
constant is unity, then $\iiint \psi^*(H\psi)\,dx\,dy\,dz$ is the energy which appears
in the Schrödinger problem. Further inspection shows the energy to be a
minimum rather than a maximum in most cases of physical interest. Upon
these results is based one of the most powerful methods of obtaining
approximate solutions of eq. (19). (Cf. sec. 11.18.)
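A small illustration of such an approximate method, offered as a sketch under stated assumptions rather than as the procedure of sec. 11.18: take one dimension, the potential $V = \tfrac{1}{2}Kx^2$, and the normalized trial function $\psi \propto e^{-\alpha x^2/2}$. The integral (18a) then evaluates in closed form to $E(\alpha) = k\alpha/2 + K/(4\alpha)$, and minimizing over $\alpha$ should give $E = \sqrt{kK/2}$. The names `trial_energy` and `minimize_golden`, and the golden-section search itself, are choices made for this example.

```python
import math

def trial_energy(alpha, k, K):
    """Value of integral (18a) for the Gaussian trial function exp(-alpha x^2 / 2)."""
    return 0.5 * k * alpha + K / (4.0 * alpha)

def minimize_golden(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c, d = b - g * (b - a), a + g * (b - a)
    return 0.5 * (a + b)

k, K = 0.5, 1.0
alpha_best = minimize_golden(lambda al: trial_energy(al, k, K), 0.1, 10.0)
print(alpha_best, trial_energy(alpha_best, k, K), math.sqrt(k * K / 2.0))
```

With $k = h^2/8\pi^2 m$ and $K = m\omega^2$ the minimum is $\tfrac{1}{2}(h/2\pi)\omega$, which for this trial function happens to be the exact lowest energy.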

6.7. Concluding Remarks. — In concluding this chapter, we note a few 
possible generalizations of the theory given here. In the first place, one 
may remove the restriction that $\delta y = 0$ at the limits of integration. This
means, with reference to Fig. 1, that the curves y(x) and Y(x) do not have
the same termini. The integrated term which appears in the partial integration
leading to eq. (6) will then no longer vanish, and there arise three
conditions in place of eq. (7):

$$\frac{\partial I}{\partial y} - \frac{d}{dx}\frac{\partial I}{\partial y_x} = 0; \quad \left[\frac{\partial I}{\partial y_x}\right]_{x_1} = 0; \quad \left[\frac{\partial I}{\partial y_x}\right]_{x_2} = 0$$

The second and third of these then serve to fix the arbitrary constants in 
the solution of Euler's equation. 

A further generalization is needed when the limits $x_1$ and $x_2$ themselves
are no longer fixed. Whenever this happens, introduction of a new
parameter, in terms of which both x and y may be expressed, reduces the
problem to the forms here discussed.⁴ The Principle of Least Action
involves a variation problem with variable limits. Since Hamilton's
principle is in general more powerful, the former, in spite of its historical
interest, will here be omitted.

When the integrand I involves higher derivatives than the first, no
great complications arise. The Euler equation then contains additional 
terms. 5 The point where our simple treatment has been most deficient is 
in its omission of all considerations establishing the actual existence of 
maximizing and minimizing curves. It will be recalled that Euler’s 
equations are merely necessary conditions. They furnish no assurance 
whatever that the curves sought are indeed present among the extremals. 
For these more mathematical questions we refer the reader to the treatises 
by Bolza, Bliss, and Kneser. 



REFERENCES 

Other good general texts on the calculus of variations are: 

Bliss, G. A., "Calculus of Variations," Open Court, La Salle, 1925.

Bolza, O., "Vorlesungen über Variationsrechnung," Teubner, Leipzig, 1909.

Forsyth, A. R., "Calculus of Variations," Cambridge University Press, 1927.

For modern applications, see

Lanczos, C., "The Variational Principles of Dynamics," University of Toronto Press,
1949.

Morse, P. M., and Feshbach, H., "Methods of Theoretical Physics," McGraw-Hill
Book Co., 1953.

Wentzel, G., "Quantum Theory of Fields," Interscience Publishers, Inc., New York,
1949.

4 See Byerly, W. E., "Introduction to the Calculus of Variations," Harvard
University Press, 1917.



CHAPTER 7


PARTIAL DIFFERENTIAL EQUATIONS OF CLASSICAL PHYSICS

7.1. General Considerations. — The general theory of partial differential
equations is well beyond the scope of this book and will not be treated
in a systematic way.¹ Attention will here be limited to a small number of
partial differential equations which are of frequent occurrence and most
of which may be resolved by a powerful method known as the separation
of variables. Before we proceed to consider specific examples, however, a
few remarks about the meaning and variety of the solutions are in order.
The simplest type of an ordinary differential equation, that of first
order, has a general solution which contains one arbitrary constant.
Geometrically it may be interpreted as a set of plane curves labeled by different
values of the arbitrary constant. In particular, if the equation is linear,
there is but one curve passing through a given point, and this curve is
specified when the value of y for some value of x is prescribed.

The simplest type of partial differential equation is one in two
independent variables (x and y), and the dependent variable z, which is
linear and of the first order. Its solutions represent, geometrically, a set
of surfaces constructed over the X-Y plane. The question may be asked:
Is one such surface uniquely determined by requiring that it include a given
point? If this were true, the manifold of solutions of the partial
equation, z(x,y), would reduce to a single surface when it is specified that
the solution shall contain that point.

This, however, is not the case. For consider the simple equation
$\partial z/\partial x + \partial z/\partial y = 0$. It is clear that any function of the form $z = f(x - y)$
will satisfy it. This function is not uniquely determined by fixing one
point of it. The origin, for instance, is contained in every surface
$z = c(x - y)$, and yet every different value of c defines a different surface.
Neither does a prescribed curve fix a surface. For let it be required
that the solution z of the partial equation above shall pass through the line
x = y in the X-Y plane. This is certainly accomplished by any function
$z = (x - y)^n$, but there is an infinite number of such surfaces. Thus


in dealing with the solutions of partial differential equations we are con- 
fronted with a variety of functions which far transcends the degree of 
complexity encountered in connection with ordinary differential equations. 
In fact one must not be surprised to find that the complete geometric 
specification of a solution of a partial equation even of the simplest type 
usually requires the fixation of an infinite number of parameters. 

7.2. Laplace’s Equation. — An equation which arises in almost all 
branches of analysis is Laplace's: 

$$\nabla^2 V = 0 \tag{7-1}$$

Its intuitive meaning was discussed in the chapter on the calculus of variations
(sec. 6.4), where eq. (1) was shown to be equivalent to the postulate
that V shall have the least mean square gradient. The function V satisfying (1)
may be said to be the "smoothest" of all functions. This is obvious when
Laplace's equation is solved in one dimension, for then it simply reads:
$d^2V/dx^2 = 0$ and has as its solutions all straight lines.

To indicate briefly the range of application of eq. (1) we state three 
instances in which it occurs : 

a. A fundamental theorem of function theory states: 

Let z = x + iy; then the function f(z) takes the form 

$$f(z) = u(x,y) + iv(x,y)$$

wherein u, v, x, y are all real, if and only if the functions u and v satisfy:

$$\nabla^2 u = 0, \qquad \nabla^2 v = 0$$

b. In sec. 4.12 it was shown that the velocity v of an indestructible
fluid, as a function of space coordinates and the time, must be a solution
of the equation of continuity, which reads

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0$$

If the fluid is incompressible, its density $\rho$ is constant, and the equation
reads

$$\nabla\cdot\mathbf{v} = 0$$

If, furthermore, the motion is irrotational, the velocity vector is the gradient
of a scalar function V, known as the velocity potential: $\mathbf{v} = -\nabla V$,
and the equation of continuity thus becomes equivalent to Laplace's:

$$\nabla^2 V = 0.$$

c. The electrostatic potential in a region of space not occupied by 
charges satisfies Laplace's equation. 


independent variables) are quite different; moreover, that the form of
the solution even for the same number of dimensions will be different in
different systems of coordinates.

7.3. Laplace’s Equation in Two Dimensions. — a. Rectangular Coordi 
nates . The equation reads : 


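$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = 0 \tag{7-2}$$

A method, not of universal applicability but suitable for this particular
problem, involves the transformation to a new set of independent variables

$$\xi = x + iy, \qquad \eta = x - iy$$

In terms of these,

$$\frac{\partial^2 V}{\partial x^2} = \frac{\partial^2 V}{\partial\xi^2} + 2\frac{\partial^2 V}{\partial\xi\,\partial\eta} + \frac{\partial^2 V}{\partial\eta^2}, \qquad \frac{\partial^2 V}{\partial y^2} = -\frac{\partial^2 V}{\partial\xi^2} + 2\frac{\partial^2 V}{\partial\xi\,\partial\eta} - \frac{\partial^2 V}{\partial\eta^2}$$

so that

$$\nabla^2 V = 4\frac{\partial^2 V}{\partial\xi\,\partial\eta} = 0$$

Clearly, this equation admits both $V = f(\xi)$ and $V = f(\eta)$ as solutions;
hence

$$V = f_1(\xi) + f_2(\eta) = f_1(x + iy) + f_2(x - iy)$$

where $f_1$ and $f_2$ are any two functions which are twice differentiable. The
reader will hardly fail to see the connection between this result and the
statement above concerning the functions of a complex variable.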
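The statement is easily tested numerically. In the sketch below the particular choices $f_1(\xi) = \xi^3$ and $f_2(\eta) = e^\eta$ are arbitrary examples, and the five-point difference formula for the Laplacian, like the function names, is an assumption of this illustration.

```python
import cmath

def laplacian_2d(v, x, y, h=1e-3):
    """Five-point finite-difference estimate of V_xx + V_yy."""
    return (v(x + h, y) + v(x - h, y) + v(x, y + h) + v(x, y - h) - 4.0 * v(x, y)) / h ** 2

def f1(xi):
    return xi ** 3

def f2(eta):
    return cmath.exp(eta)

def V(x, y):
    """V = f1(x + iy) + f2(x - iy), a complex-valued solution."""
    return f1(complex(x, y)) + f2(complex(x, -y))

print(abs(laplacian_2d(V, 0.3, 0.7)))                          # close to zero
print(laplacian_2d(lambda x, y: x * x + y * y, 0.3, 0.7))      # close to 4: not a solution
```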

For many problems another form of solution, obtainable by the method
of separation of variables, is more satisfactory. Let us make the assumption,
justifiable by its success, that V may be written in the form

$$V = X(x)\cdot Y(y) \tag{7-3}$$

where X and Y are functions of only one independent variable, x and y
respectively. When (3) is substituted in (2) there results, after division
by V,

$$\frac{X''}{X} + \frac{Y''}{Y} = 0 \tag{7-4}$$

an equation in which primes denote differentiation of a function with
respect to its own variable. If (4) is to have a solution at all, then each
term on the left must separately be equal to a constant; for a change in x
would not alter the value of $Y''/Y$, and a change in y would not affect
$X''/X$. One may therefore conclude:

$$\frac{X''}{X} = k^2, \qquad \frac{Y''}{Y} = -k^2 \tag{7-5}$$

where the constant parameter $k^2$, written in this form for convenience,
may have any value, real or complex. These are two ordinary equations
which may easily be solved by the methods of Chapter 2. Eq. (5) leads
at once to

$$X = c_1 e^{\pm kx}, \qquad Y = c_2 e^{\pm iky}$$

Hence a solution of (2), characterized by a given value of the parameter k,
will be

$$V_k = c_k e^{\pm k(x \pm iy)} \tag{7-6}$$

Since (2) is a linear equation, a sum of expressions like (6) is also a solution.
Hence a more general solution is

$$V = \sum_k c_k e^{\pm k(x \pm iy)}$$

or even

$$V = \int c(k)\,e^{\pm k(x \pm iy)}\,dk \tag{7-6a}$$

For the value k = 0 the result is of a more special form. Eq. (5) then leads
to

$$X = a_1 x + a_2, \qquad Y = b_1 y + b_2$$

so that

$$V = axy + cx + dy + e \tag{7-7}$$

Which of the solutions, (6), (6a), or (7), is to be chosen depends entirely on
the nature of the problem at hand. (Cf. examples.)

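b. Polar Coordinates. Laplace's equation reads:

$$\frac{\partial^2 V}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial V}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2 V}{\partial\varphi^2} = 0 \tag{7-8}$$

Using again the method of separation of variables, we put

$$V = P(\rho)\,\Phi(\varphi)$$

When this is substituted into (8) there results, after multiplication by
$\rho^2/V$,

$$\rho^2\frac{P''}{P} + \rho\frac{P'}{P} + \frac{\Phi''}{\Phi} = 0$$

Here the first two terms are independent of $\varphi$, the third of $\rho$.
Hence we may write

$$\rho^2\frac{P''}{P} + \rho\frac{P'}{P} = k^2, \qquad \frac{\Phi''}{\Phi} = -k^2 \tag{7-9}$$

The solution of the first equation is at once seen to be $P = \rho^{\pm k}$, that of the
second, $\Phi = e^{\pm ik\varphi}$. Hence

$$V_k = c_k\,\rho^{\pm k} e^{\pm ik\varphi}$$

or, more generally,

$$V = \sum_k c_k\,\rho^{\pm k} e^{\pm ik\varphi}$$

For k = 0, (9) becomes

$$P'' + \frac{P'}{\rho} = 0, \qquad \Phi'' = 0$$

When integrated once, the first of these yields $P' = a_1/\rho$, and on further
integration $P = a_1\ln\rho + a_2$. On the other hand,
$\Phi = b_1\varphi + b_2$. Hence a particular solution is

$$V = (a_1\ln\rho + a_2)(b_1\varphi + b_2)$$

Again, further information must be available before any of these
results can be selected as a suitable solution of a given problem.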


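7.4. Laplace's Equation in Three Dimensions.² — a. Rectangular Coordinates.
An application of the method of separation of variables to

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0$$

follows precisely along the lines of sec. 7.3a. We put $V = X(x)Y(y)Z(z)$
and obtain

$$\frac{X''}{X} + \frac{Y''}{Y} + \frac{Z''}{Z} = 0$$

Each of these terms must separately equal a constant, and the sum of the
constants (which we write as $k_1^2$, $k_2^2$, $k_3^2$) must vanish. Hence

$$V = e^{k_1 x + k_2 y + k_3 z}, \qquad k_1^2 + k_2^2 + k_3^2 = 0 \tag{7-12}$$

If $k_1$, $k_2$, or $k_3$ is zero, the corresponding factor in (12) is replaced by
$a_1 x + a_2$, etc. A more general solution would be

$$V = \sum_{k_1 k_2 k_3} c_{k_1 k_2 k_3}\,e^{k_1 x + k_2 y + k_3 z}$$

In this connection it is sometimes convenient to regard $k_1$, $k_2$, $k_3$ formally
as the components of a vector k. Eq. (12) may then be written

$$V_{\mathbf{k}} = c(\mathbf{k})\,e^{\mathbf{k}\cdot\mathbf{r}}, \qquad |\mathbf{k}| = 0$$

b. Cylindrical Coordinates. In accordance with the results of Chapter 5,
Laplace's equation in cylindrical coordinates reads

$$\frac{\partial^2 V}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial V}{\partial\rho} + \frac{\partial^2 V}{\partial z^2} + \frac{1}{\rho^2}\frac{\partial^2 V}{\partial\varphi^2} = 0$$

We put

$$V = P(\rho)\,Z(z)\,\Phi(\varphi)$$

substitute, and divide by V. The result is

$$\frac{P''}{P} + \frac{1}{\rho}\frac{P'}{P} + \frac{1}{\rho^2}\frac{\Phi''}{\Phi} + \frac{Z''}{Z} = 0$$

Clearly, the last term on the left must be constant; let us put it equal to
$-k^2$. Then

$$Z = c_1 e^{\pm ikz} \tag{7-13}$$

The remaining equation,

$$\frac{P''}{P} + \frac{1}{\rho}\frac{P'}{P} + \frac{1}{\rho^2}\frac{\Phi''}{\Phi} - k^2 = 0$$

when multiplied by $\rho^2$, separates again into two equations:

$$\frac{\Phi''}{\Phi} = -l^2, \qquad \rho^2 P'' + \rho P' - (k^2\rho^2 + l^2)P = 0$$

The first has the solution

$$\Phi = c_2 e^{\pm il\varphi}$$

the second turns into Bessel's differential equation (2-57) when the substitution
$ik\rho = x$ is made, for it then reads:

$$\frac{d^2 P}{dx^2} + \frac{1}{x}\frac{dP}{dx} + \left(1 - \frac{l^2}{x^2}\right)P = 0$$

The solution of this equation was discussed in sec. 2.14. It will here be
denoted by $Z_l$. Collecting these results we have

$$V_{kl} = c_{kl}\,Z_l(ik\rho)\,e^{\pm i(kz \pm l\varphi)} \tag{7-14}$$

When l = 0, $\Phi = a_1\varphi + a_2$; hence we obtain as another solution of
lesser generality than (14) the expression:

$$V_{k0} = c_k\,Z_0(ik\rho)\,e^{\pm ikz}(a_1\varphi + a_2) \tag{7-14a}$$

When k = 0, $Z = (b_1 z + b_2)$ instead of the function (13). The equation
for P takes the form

$$\rho^2 P'' + \rho P' - l^2 P = 0$$

which was already encountered in sec. 3b. It has the solution $P = \rho^{\pm l}$. Hence

$$V_{0l} = \rho^{\pm l}(b_1 z + b_2)\,e^{\pm il\varphi} \tag{7-14b}$$

Finally, when both l and k are zero, the solution may be seen to have the
form

$$V_{00} = (a_1\ln\rho + a_2)(b_1 z + b_2)(c_1\varphi + c_2) \tag{7-14c}$$

The most general function satisfying Laplace's equation is a superposition
of solutions (14)-(14c).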

c. Polar (Spherical) Coordinates. As was shown in the chapter on
coordinate systems (sec. 5.4), the equation $\nabla^2 V = 0$, when transformed to
polar coordinates, reads:

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial V}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial V}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 V}{\partial\varphi^2} = 0 \tag{7-15}$$

Multiplication by $r^2\sin^2\theta$ will isolate the term $\partial^2 V/\partial\varphi^2$ as the only one
depending on $\varphi$ from the remainder of the equation. If, therefore, we
put it equal to $-m^2$ so that

$$\Phi = c\,e^{\pm im\varphi} \tag{7-16}$$

(V being written as $R(r)\cdot\Theta(\theta)\cdot\Phi(\varphi)$), then eq. (15) takes the form

$$\frac{\sin^2\theta}{R}\frac{d}{dr}(r^2 R') + \frac{\sin\theta}{\Theta}\frac{d}{d\theta}(\sin\theta\,\Theta') - m^2 = 0$$

When this is divided through by $\sin^2\theta$ the terms involving r are
separated from those involving $\theta$. Hence we obtain

$$\frac{1}{\Theta}\frac{1}{\sin\theta}\frac{d}{d\theta}(\sin\theta\,\Theta') - \frac{m^2}{\sin^2\theta} + c = 0 \tag{7-17}$$

$$\frac{1}{R}\frac{d}{dr}(r^2 R') - c = 0 \tag{7-18}$$

where c denotes the same constant in both equations. It will be convenient
to write this constant in the form $c = l(l + 1)$. Let us make
the substitution $\cos\theta = x$ in eq. (17), obtaining (after multiplication by $\Theta$)

$$(1 - x^2)\frac{d^2\Theta}{dx^2} - 2x\frac{d\Theta}{dx} + \left[l(l + 1) - \frac{m^2}{1 - x^2}\right]\Theta = 0 \tag{7-19}$$

This, however, is none other than the differential equation (cf. eq. 2-41) for
associated spherical harmonics discussed in sec. 2.11. Special solutions
were studied in sec. 3.6. They were written in the form

$$\Theta = P_l^m(x)$$

It must here be noted that these functions do not represent the general
solution of eq. (19), but a particular one having the property of being finite
for all values of x between −1 and +1, including these limits. In most
physical problems this is a condition naturally to be imposed on the solution
of Laplace's equation; there are cases, however, in which a more
general solution of (19) must be chosen. It was also found in sec. 2.11
that the constant l must, for the sake of finiteness, be a positive integer.
We shall restrict the present consideration to problems in which these
conditions hold, and assume

$$\Theta = P_l^m(\cos\theta) \tag{7-20}$$

This expression has no meaning unless m, also, is a positive integer.
Again, the nature of most physical problems imposes this requirement,
for if V represents the distribution of any physical quantity in space, it
must obviously be periodic in $\varphi$ and have a period of $2\pi$, since otherwise
$V(\varphi)$ and $V(2\pi + \varphi)$ would have different values although $\varphi$ and $2\pi + \varphi$
denote the same angle. But the function (16) does not possess this
periodicity unless m is an integer.

The function R is now easily obtained by solving (18) which reads on
expansion:

$$r^2 R'' + 2rR' - l(l + 1)R = 0$$

Its solution is obviously of the form $R = r^a$, and on substitution we find

$$a(a - 1) + 2a - l(l + 1) = 0$$

so that a is either l or $-(l + 1)$. Hence

$$R = a_1 r^l + a_2 r^{-l-1} \tag{7-21}$$

In view of (16), (20) and (21) we conclude that a solution of Laplace's
equation in polar coordinates has the form

$$V_{lm} = (a_1 r^l + a_2 r^{-l-1})\,P_l^m(\cos\theta)\,e^{\pm im\varphi} \tag{7-22}$$

and the general solution will be a superposition of any number of such
functions.
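As a spot check of the form (22), the following sketch takes the case l = 2, m = 0, for which $P_2(\cos\theta) = \tfrac{1}{2}(3\cos^2\theta - 1)$, and tests it against a finite-difference Laplacian evaluated in Cartesian coordinates. The function names, the coefficients $a_1 = a_2 = 1$, and the sample point are arbitrary assumptions of this illustration.

```python
import math

def V_20(x, y, z, a1=1.0, a2=1.0):
    """(a1 r^2 + a2 r^-3) P_2(cos theta), i.e. eq. (22) with l = 2, m = 0."""
    r = math.sqrt(x * x + y * y + z * z)
    cos_t = z / r
    p2 = 0.5 * (3.0 * cos_t ** 2 - 1.0)
    return (a1 * r ** 2 + a2 * r ** -3) * p2

def laplacian_3d(f, x, y, z, h=1e-3):
    """Seven-point finite-difference estimate of the Laplacian."""
    return (f(x + h, y, z) + f(x - h, y, z) + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6.0 * f(x, y, z)) / h ** 2

print(laplacian_3d(V_20, 0.8, -0.5, 1.1))    # close to zero away from the origin
```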

Other systems of coordinates in which the equation $\nabla^2 V = 0$ can be


EXAMPLES OF SOLUTIONS OF LAPLACE’S EQUATION 

7.6. Sphere Moving through an Incompressible Fluid without Vortex 
Formation. — Since the motion of the liquid is irrotational, its velocity, v, 
at every point is the gradient of a scalar potential, F, which satisfies 
Laplace’s equation. Thus 

v = -VF, and V 2 F = 0 


Which of all the solutions derived above is to be chosen, depends entirely 
on the boundary conditions of the problem. These are, clearly : 

a. The radial velocity of the fluid at the surface of the sphere of radius $r_0$
shall be equal to the velocity of the sphere, $v_0$, times the cosine of the angle
which r makes with the direction of motion of the sphere. Taking the
latter as the polar axis, we have

$$-\left(\frac{\partial V}{\partial r}\right)_{r=r_0} = v_0 \cos\theta \tag{a}$$

b. The distant portions of the liquid are not affected by the motion of
the sphere. Hence

$$\left(\frac{\partial V}{\partial r}\right)_{r=\infty} = 0 \tag{b}$$


The form of these conditions at once prescribes the use of polar coordinates.
The solution is, therefore, of the form (22). To satisfy (a) we
must put the angular part of this expression equal to $\cos\theta$; there is no
dependence on $\varphi$ at all. The only possible value of m which produces
freedom from $\varphi$ is zero, and of all the functions $P_l^0(\cos\theta)$, only $P_1^0(\cos\theta)$
is equal to $\cos\theta$. Hence $l = 1$. Condition (a) now states:

$$-\left[\frac{\partial}{\partial r}\left(a_1 r + a_2 r^{-2}\right)\right]_{r=r_0} \cos\theta = v_0 \cos\theta$$

whence

$$-a_1 + 2a_2 r_0^{-3} = v_0$$

But condition (b) cannot be satisfied unless $a_1 = 0$. Therefore
$2a_2 r_0^{-3} = v_0$. Eq. (22) has thus been reduced to

$$V = \frac{v_0 r_0^3}{2r^2} \cos\theta$$

This represents the velocity potential for the case in question.
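
The result of sec. 7.6 can be checked numerically. The following is only a sketch under stated assumptions: the function names, the default values $v_0 = r_0 = 1$, and the finite-difference step sizes are illustrative choices and not part of the text. It evaluates $V = v_0 r_0^3 \cos\theta / 2r^2$ in Cartesian form, estimates its Laplacian by central differences, and estimates the radial velocity $-\partial V/\partial r$ for comparison with conditions (a) and (b).

```python
import math

def potential(x, y, z, v0=1.0, r0=1.0):
    """Velocity potential V = v0*r0**3*cos(theta)/(2*r**2), polar axis along z."""
    r = math.sqrt(x * x + y * y + z * z)
    cos_theta = z / r
    return v0 * r0**3 * cos_theta / (2.0 * r * r)

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference estimate of the Laplacian of f."""
    centre = f(x, y, z)
    total = (f(x + h, y, z) + f(x - h, y, z)
             + f(x, y + h, z) + f(x, y - h, z)
             + f(x, y, z + h) + f(x, y, z - h) - 6.0 * centre)
    return total / (h * h)

def radial_velocity(theta, r, v0=1.0, r0=1.0, h=1e-5):
    """Estimate -dV/dr along the ray at polar angle theta (taken in the x-z plane)."""
    def along_ray(rad):
        return potential(rad * math.sin(theta), 0.0, rad * math.cos(theta), v0, r0)
    return -(along_ray(r + h) - along_ray(r - h)) / (2.0 * h)
```

With these definitions, `laplacian(potential, 1.3, -0.7, 0.9)` should be close to zero, `radial_velocity(0.6, 1.0)` should be close to `math.cos(0.6)` as condition (a) demands, and `radial_velocity(0.6, 1000.0)` should be negligible as condition (b) demands.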

*7 A Qimnto Ac n mnHrtv. 


charged cylinder (wire) of infinite length, and a uniformly charged infinite
plane.

a. The boundary condition in the first case is obviously $V(r_0,\theta,\varphi) = V_0$,
a constant, provided we write $r_0$ for the radius of the sphere. Since spherical
polar coordinates are used in describing this condition, the general
solution of Laplace’s equation must be taken in the form (22). The condition
also prescribes that V shall be independent of $\theta$ and $\varphi$, for otherwise
$V(r_0,\theta,\varphi)$ could not be constant. Hence m and l must both be zero. We
conclude, therefore, that $V = a_1 + a_2/r$, and since $a_1 + a_2/r_0 = V_0$, we
find on eliminating $a_2$:

$$V = V_0\,\frac{r_0}{r} + a_1\left(1 - \frac{r_0}{r}\right)$$

If we require in addition that V be zero at $r = \infty$ (which would be true if
the potential were produced entirely by the charged sphere) the constant
$a_1 = 0$.

b. The boundary condition in the case of the cylinder reads
$V(\rho_0,z,\varphi) = V_0$, $\rho_0$ denoting the radius of the cylinder. Solution (14) is
now relevant; but the observation that V is independent of $\varphi$ and z leads
at once to (14c) with $b_1 = c_1 = 0$. Hence we have $V = a_1 \ln\rho + a_2$.

On eliminating $a_2$ by means of the boundary condition we find

$$V = V_0 + a_1 \ln\frac{\rho}{\rho_0}$$

The constant $a_1$ can be determined only when further facts, e.g., the charge
density on the cylinder, are known. (In fact, $a_1 = -2\lambda$ where $\lambda$ is linear
charge density.)

c. In the case of the charged plane we require $V(x,y,0) = V_0$, supposing
z = 0 to define the plane. This leads at once to a solution of the form
(12), but with $k_1 = k_2 = 0$; for otherwise V would depend on x and y.
Since then $k_3$ must also vanish,

$$V = (a_1 x + a_2)(b_1 y + b_2)(c_1 z + c_2)$$

Again, to satisfy the boundary condition, $a_1 = b_1 = 0$, so that

$$V = c_1 z + V_0$$

The constant $c_1$ can be eliminated when the charge density on the plane is
known. ($c_1 = -4\pi\sigma$ if $\sigma$ is surface density of charge.)

All these results could have been obtained much more simply by applying
Gauss’ law of electrostatics; our purpose here was to exhibit them as
solutions of Laplace’s equation.


Problem. To find the potential produced when a conducting sphere is
placed in an originally uniform field of strength $E_0$, extending along the Z-axis. Use
conditions:

V = 0 at $r = r_0$ (radius of sphere)

$V = -E_0 z = -E_0 r\cos\theta$ at $r \to \infty$

Ans. $V = -E_0 r\cos\theta\left[1 - \left(\dfrac{r_0}{r}\right)^3\right]$



Fig. 7-1 

7.7. Conducting Sphere in the Field of a Point Charge. — We wish to
find the potential at P due to a point charge +q situated on the
Z-axis when a conducting earthed sphere, distorting the field, is present
with its center at O (cf. Fig. 1). Clearly,

$$V(r,\theta) = \frac{q}{s} + U$$

if U is the potential due to the induced charge on the sphere.
U must be a solution of Laplace’s equation and may conveniently
be written in the form (22). Here, however, it becomes necessary to
retain full generality and use a superposition of harmonic functions:

$$U = \sum_{l,m} \left(a_{lm} r^l + b_{lm} r^{-l-1}\right) P_l^m(\cos\theta)\, e^{\pm im\varphi}$$

From the symmetry of the physical distribution about the Z-axis it is clear
that U cannot depend on $\varphi$; hence m = 0. Also, since U must vanish at
$r = \infty$, every $a_{lm} = 0$. Hence

$$U = \sum_l b_l r^{-l-1} P_l(\cos\theta) \tag{7-23}$$

The coefficients $b_l$ are to be determined by the condition that V shall be
zero on the surface of the sphere:

$$\left(\frac{q}{s}\right)_{\text{on sphere}} + \sum_l b_l r_0^{-l-1} P_l(\cos\theta) = 0$$

The first term on the left can be expanded by means of a theorem proved
in the discussion of the Legendre polynomials (eq. 3-24 et seq.):

$$\frac{1}{s} = \frac{1}{a}\sum_{l=0}^{\infty} \left(\frac{r}{a}\right)^l P_l(\cos\theta) \quad \text{if } a > r \tag{7-24a}$$

Hence the foregoing condition becomes:

$$\sum_l \left[\frac{q}{a}\left(\frac{r_0}{a}\right)^l + b_l r_0^{-l-1}\right] P_l(\cos\theta) = 0$$

But this is satisfied only if the coefficient of every $P_l$ is zero, so that

$$b_l = -q\, a^{-l-1} r_0^{2l+1}$$

On substituting this back into (23) we find

$$U = -\frac{q r_0}{r a}\sum_l \left(\frac{r_0^2}{a r}\right)^l P_l(\cos\theta) \tag{7-25}$$

a result which permits a very simple and interesting interpretation. Consider
a point, such as a' (cf. Fig. 1) on the Z-axis. If r > a', the expansion
of 1/s' may be seen to be (see derivation leading to eq. 3-27)

$$\frac{1}{s'} = \frac{1}{r}\sum_l \left(\frac{a'}{r}\right)^l P_l(\cos\theta); \qquad r > a' \tag{7-24b}$$

in contrast to (24a). But (25) is of the same form as this; indeed it
becomes

$$U = -\frac{r_0}{a}\,\frac{q}{s'}$$

if we put $a' = r_0^2/a$. Our final result may now be written

$$V = \frac{q}{s} - \frac{q'}{s'} \tag{7-26}$$

provided q' is identified with $(r_0/a)q$. In words: when a conducting
(earthed) sphere is placed near a point charge +q it changes the potential
in the same manner as would a point charge of opposite sign and magnitude
$q' = (r_0/a)q$, placed at the point $a' = r_0^2/a$. The charge q' is said to be
the image of q.

The same reasoning holds when an earthed plane is placed near a charge.
For, suppose we put $a = r_0 + \Delta$, $a' = r_0 - \Delta'$, and let $r_0$ go to infinity.
From $a'a = r_0^2$ we then get $\Delta = \Delta'$, and $r_0/a$ approaches 1. It is seen
that the effect of the plane can also be expressed by means of an image
charge which, in this case, has the same magnitude as the real charge and is
located at its mirror image.
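
The image construction lends itself to a quick numerical check. The sketch below rests on illustrative assumptions: the function names and the sample values q = 1, a = 2, $r_0 = 1$ are hypothetical and not taken from the text. It evaluates $V = q/s - q'/s'$ with $q' = (r_0/a)q$ and $a' = r_0^2/a$, and also sums the Legendre series (25) directly, with $P_l$ generated by the usual three-term recurrence.

```python
import math

def potential_with_image(r, theta, q=1.0, a=2.0, r0=1.0):
    """V = q/s - q'/s' for a charge q at z = a outside an earthed sphere of radius r0."""
    q_image = q * r0 / a          # magnitude of the image charge q'
    a_image = r0 * r0 / a         # its distance a' from the centre
    x, z = r * math.sin(theta), r * math.cos(theta)
    s = math.hypot(x, z - a)              # distance from the real charge
    s_image = math.hypot(x, z - a_image)  # distance from the image charge
    return q / s - q_image / s_image

def legendre(l, x):
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def induced_potential_series(r, theta, q=1.0, a=2.0, r0=1.0, terms=60):
    """Partial sum of eq. (25): U = -(q r0/(r a)) * sum_l (r0^2/(a r))^l P_l(cos theta)."""
    ratio = r0 * r0 / (a * r)
    c = math.cos(theta)
    return -(q * r0 / (r * a)) * sum(ratio**l * legendre(l, c) for l in range(terms))
```

On the sphere, `potential_with_image(1.0, theta)` should vanish for every `theta`, and outside the sphere the partial sum of (25) should agree with $-q'/s'$.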


Problem. Find the potential of an electric dipole, and of an axial electric quadrupole.
A dipole (quadrupole) may be defined as a distribution of charge whose potential, while
vanishing at infinity, is proportional to $P_1(\cos\theta)$ (respectively $P_2(\cos\theta)$).

Ans. $c_1 \dfrac{\cos\theta}{r^2}$; $\quad c_2 \dfrac{3\cos^2\theta - 1}{r^3}$.


7.8. The Wave Equation. — To give a concise definition of a wave in 
physical descriptive terms is not an easy matter; mathematically it is 
defined as the condition of a physical quantity, U, which satisfies the 
differential equation 

$$v^2 \nabla^2 U - \frac{\partial^2 U}{\partial t^2} = 0 \tag{7-27}$$


For a reason which will soon be evident, v is called the phase velocity of the 
wave. In general, v may be a function of space coordinates (wave travel- 
ing in a non-homogeneous medium). When this is true, eq. (27) has an 
enormous variety of solutions, some of which would hardly conform to the 
more intuitive conception commonly attached to the word wave. This 
general case is of special interest in quantum or wave mechanics and in 
certain branches of optics and will be dealt with in Chapter 11. 

In the present section, v will be considered constant, that is, independent 
of space and time. Before examining eq. (27) by the method of separation 
of variables, we discuss a form of solution which is interesting from a 
physical point of view. For it happens that this equation can be solved
by the substitution

$$\xi = \alpha x + \beta y + \gamma z + vt$$

$\alpha$, $\beta$, $\gamma$ being constants, provided $\nabla^2$ is written in its Cartesian form. On
substituting this, (27) takes the form

$$\left[v^2(\alpha^2 + \beta^2 + \gamma^2) - v^2\right]\frac{d^2 U}{d\xi^2} = 0$$

which is clearly satisfied if we put

$$\alpha^2 + \beta^2 + \gamma^2 = 1 \tag{7-28}$$

Subject to this condition, the substitution

$$\eta = \alpha x + \beta y + \gamma z - vt$$

will also lead to a solution $U(\eta)$. The functional form of U is left entirely
arbitrary aside from the requirement that it must permit of two differentiations.
We conclude, therefore, that

$$U = f_1(\xi) + f_2(\eta) \tag{7-29}$$

is a general solution of the wave equation (with constant v).

Relation (28), however, allows the interpretation of $\alpha$, $\beta$, and $\gamma$ as direction
cosines,³ that is, as components of a unit vector, $\boldsymbol{\sigma}$. Eq. (29) then
takes the form

$$U = f_1(\boldsymbol{\sigma}\cdot\mathbf{r} + vt) + f_2(\boldsymbol{\sigma}\cdot\mathbf{r} - vt) \tag{7-30}$$

Now constant values of $f_1(\boldsymbol{\sigma}\cdot\mathbf{r} + vt)$ are defined by $\boldsymbol{\sigma}\cdot\mathbf{r} = -vt$; they lie
on a plane traveling along $-\boldsymbol{\sigma}$ with a velocity v. Constant values of
$f_2(\boldsymbol{\sigma}\cdot\mathbf{r} - vt)$ are given by $\boldsymbol{\sigma}\cdot\mathbf{r} = vt$; they lie on a plane traveling along
$+\boldsymbol{\sigma}$ with velocity v. The representation (30) therefore describes two
plane waves traveling in opposite directions with the same speed.
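
That an arbitrary twice-differentiable profile of $\boldsymbol{\sigma}\cdot\mathbf{r} - vt$ satisfies (27) can be illustrated numerically. The following sketch makes several assumptions of its own: the helper names, the Gaussian test profile used in the usage note, and the step size h are arbitrary illustrative choices. It builds the plane wave for a given direction and estimates the residual $v^2\nabla^2 U - \partial^2 U/\partial t^2$ by central differences.

```python
import math

def plane_wave(profile, sigma, v):
    """Return U(x, y, z, t) = profile(sigma . r - v t), with sigma normalised to unit length."""
    norm = math.sqrt(sum(c * c for c in sigma))
    alpha, beta, gamma = (c / norm for c in sigma)  # direction cosines, eq. (28)
    def u(x, y, z, t):
        return profile(alpha * x + beta * y + gamma * z - v * t)
    return u

def wave_residual(u, x, y, z, t, v, h=1e-3):
    """Central-difference estimate of v^2 * Laplacian(U) - d^2U/dt^2, eq. (27)."""
    centre = u(x, y, z, t)
    lap = (u(x + h, y, z, t) + u(x - h, y, z, t)
           + u(x, y + h, z, t) + u(x, y - h, z, t)
           + u(x, y, z + h, t) + u(x, y, z - h, t) - 6.0 * centre) / (h * h)
    utt = (u(x, y, z, t + h) + u(x, y, z, t - h) - 2.0 * centre) / (h * h)
    return v * v * lap - utt
```

For example, with `u = plane_wave(lambda s: math.exp(-s * s), (1.0, 2.0, 2.0), 1.5)` the residual computed with `v = 1.5` should be negligible, whereas evaluating the same function with a mismatched `v = 1.0` leaves a residual of order unity.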

A solution of equal simplicity may be obtained when (27) is written in
polar coordinates provided we assume that U is a function of the radius
vector and t alone. (The solution here derived is therefore far from
general.) In that case, $\nabla^2$ reduces to $\partial^2/\partial r^2 + (2/r)(\partial/\partial r)$, and the
equation reads

$$\frac{v^2}{r}\frac{\partial^2 (rU)}{\partial r^2} = \frac{\partial^2 U}{\partial t^2}$$

The substitution $\xi = r + vt$, $rU = P$ converts it into $v^2(d^2P/d\xi^2) - v^2(d^2P/d\xi^2) = 0$;
hence $P = f_1(r + vt)$. A similar result would have been
achieved by choosing $\eta = r - vt$ in place of $\xi$. Hence

$$P = f_1(r + vt) + f_2(r - vt), \qquad U = \frac{P}{r}$$





This solution represents two spherical waves, one traveling in toward the 
origin, the other out from the origin. The factor 1/r, without which U 
would not be a solution of (27) and therefore not a wave, accounts for the 
attenuation of a spherical wave as it moves out from its source. 
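
The role of the factor 1/r can be made concrete with a small numerical experiment. This is a sketch only: the helper names and the option `with_factor` are hypothetical, introduced to compare $f(r - vt)/r$ with the bare profile $f(r - vt)$. The residual of the radial wave equation, $v^2(U_{rr} + 2U_r/r) - U_{tt}$, is estimated by central differences.

```python
import math

def spherical_wave(profile, v, with_factor=True):
    """U(r, t) = profile(r - v t) / r; with_factor=False drops the 1/r for comparison."""
    def u(r, t):
        value = profile(r - v * t)
        return value / r if with_factor else value
    return u

def radial_wave_residual(u, r, t, v, h=1e-3):
    """Estimate v^2 (U_rr + (2/r) U_r) - U_tt by central differences."""
    centre = u(r, t)
    u_rr = (u(r + h, t) + u(r - h, t) - 2.0 * centre) / (h * h)
    u_r = (u(r + h, t) - u(r - h, t)) / (2.0 * h)
    u_tt = (u(r, t + h) + u(r, t - h) - 2.0 * centre) / (h * h)
    return v * v * (u_rr + 2.0 * u_r / r) - u_tt
```

With a sine profile the residual should be negligible when the factor 1/r is kept and clearly nonzero when it is dropped.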

By suitable choices of $f_1$ and $f_2$ a great variety of wave complexes can be
formed, of which standing waves, defined by the condition $U(r,t) = F(r) \cdot G(t)$
where F and G represent new functions, are perhaps the simplest.

Problem. Show that, if $f_1$ and $f_2$ are both sine functions, written in the customary
form $\sin\,(2\pi/\lambda)(r \pm vt)$, U represents a standing wave.

We now turn to a more detailed analysis of the wave equation, based on 
the method of separation of variables. On assuming that 

U = ST 

where S is a function of space coordinates and T a function of t only, 
(27) is changed to the form 

$$v^2\,\frac{\nabla^2 S}{S} = \frac{\ddot{T}}{T}$$

the dots denoting time derivatives. Each side of this equation must equal
the same constant which, for convenience, we shall call $-\omega^2$. No supposition
concerning the reality of $\omega$ is here implied, although $\omega$ will turn out
to be real in the more interesting practical applications. The equation

$$\ddot{T} + \omega^2 T = 0$$

has the general solution

$$T_\omega = c_1 e^{i\omega t} + c_2 e^{-i\omega t} \tag{7-32}$$

The constant $\omega$, clearly, has the meaning of an “angular” frequency.

Now the space part of the wave function is defined by the equation

$$\nabla^2 S + \frac{\omega^2}{v^2} S = 0$$

The constant $\omega/v$ will henceforth be denoted by k; in terms of the wave
length $\lambda$, which is related to $\omega$ and v by the well known formula

$$\omega = \frac{2\pi v}{\lambda}$$

$k = 2\pi/\lambda$. It signifies the number of waves of given $\omega$ per $2\pi$ units of
length and is called the wave number. The equation

$$\nabla^2 S + k^2 S = 0 \tag{7-33}$$

is called the space form of the wave equation. The remainder of the present section will
be devoted to its study.

7.9. One Dimension. — Eq. (33) reduces to the simple form

$$\frac{d^2 S}{dx^2} + k^2 S = 0$$

which has the solution

$$S_k = a e^{ikx} + b e^{-ikx}$$

One such solution is obtained for every value of k. For k = 0,
$S_0 = ax + b$. It should be noted that

$$S = \sum_k S_k$$

is not a solution of (33), but that

$$U = \sum_k S_k T_k$$

is a solution of (27). (We are writing $T_k$ in place of $T_\omega$ because k is
fixed when $\omega$ is chosen.) Similar caution is required in all subsequent
considerations.

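7.10. Two Dimensions. — a. Rectangular Coordinates. The work goes
as in sec. 7.3. In place of eq. 4 we now have

$$\frac{X''}{X} + \frac{Y''}{Y} + k^2 = 0$$

Separation is achieved by putting

$$\frac{X''}{X} = -k_1^2, \qquad \frac{Y''}{Y} = -k_2^2$$

and requiring that $k_1^2 + k_2^2 = k^2$. Hence

$$S_{k_1 k_2} = XY = c_{k_1 k_2}\, e^{\pm i(k_1 x \pm k_2 y)} \tag{7-34}$$

b. Polar Coordinates. In place of (9) there results

$$\rho^2\frac{P''}{P} + \rho\frac{P'}{P} + \frac{\Phi''}{\Phi} + \rho^2 k^2 = 0$$

On equating $\Phi''/\Phi$ to $-m^2$, the radial equation becomes

$$\rho^2 P'' + \rho P' + (k^2\rho^2 - m^2)P = 0$$

It is identical with Bessel’s (eq. 2-57) when the independent variable is
taken to be $k\rho$. Hence $P = Z_m(k\rho)$ and $S_{k,m} = Z_m(k\rho)\, e^{\pm im\varphi}$
or, more generally,

$$S_k = \sum_m a_m Z_m(k\rho)\, e^{\pm im\varphi} \tag{7-35}$$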

7.11. Three Dimensions. — a. Rectangular Coordinates. Immediate
generalization of eq. (34) shows that

$$S_{k_1 k_2 k_3} = c_{k_1 k_2 k_3}\, e^{\pm i(k_1 x \pm k_2 y \pm k_3 z)} \tag{7-36}$$

provided that $k_1^2 + k_2^2 + k_3^2 = k^2$. If $k_1$, $k_2$, $k_3$ are taken to be real (an
assumption destroying the generality of the solution) they may be regarded
as the components of a vector $\mathbf{k}$, and (36) may be written

$$S(\mathbf{k}) = c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}} \tag{7-37}$$

When this result is combined with (32) one sees that a solution of the wave
equation (27) has the form

$$U = \sum_{\mathbf{k}} c(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} \pm kvt)}$$

or

$$U = \int c(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} \pm kvt)}\, d\mathbf{k} \tag{7-38}$$

The notation⁴ used here, which is rather common in modern physics, is to
be understood as follows: A function of a vector, such as $c(\mathbf{k})$, is simply to
be regarded as a function of the three real variables $k_1$, $k_2$, and $k_3$; $d\mathbf{k}$
is an abbreviation for the product of three differentials: $dk_1\,dk_2\,dk_3$.
Summations and integrations over $\mathbf{k}$ are therefore threefold.

Eq. 38 is a very useful form of the solution of the wave equation.
Physically, it corresponds to the construction of a general wave by superposition
of plane sinusoidal waves. It also permits initial conditions to
be included in the calculation rather easily. For suppose that we know
the form of the disturbance at t = 0, $U_0(x,y,z)$. The $c(\mathbf{k})$ are then given
at once by the Fourier analysis of this function, viz.:

$$U_0(x,y,z) = \int c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{k}$$

and (38) represents the wave at any other time.


Problem a. Show that, in general,

$$U(x,y,z,t) = (2\pi)^{-3} \int\!\!\int\!\!\int\!\!\int\!\!\int\!\!\int U_0(x',y',z')\, e^{i[\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}') \pm kvt]}\, dx'\,dy'\,dz'\,dk_1\,dk_2\,dk_3$$

If $U_0$ is concentrated at the origin, that is, if $U_0$ is the limit of a function which tends to
$\infty$ at the origin, but in such a way that $\iiint U_0(x,y,z)\,dx\,dy\,dz = 1$,
then

$$U(x,y,z,t) = (2\pi)^{-3} \iiint e^{i(\mathbf{k}\cdot\mathbf{r} - kvt)}\, dk_1\,dk_2\,dk_3$$

Problem b. Show that

(α) U decreases continually with time at r = 0.
(β) U is zero wherever |r| > vt.

Note the physical significance of these results.

4 This notation is indeed ambiguous. In vector analysis, dk is the element of a
vector, and hence itself a vector. Here it means the element of volume in k-space.


b. Cylindrical Coordinates. The substitutions in sec. 7.4b lead to the
ordinary differential equations

$$Z'' = -\kappa^2 Z$$

$$\Phi'' = -l^2 \Phi$$

$$\rho^2 P'' + \rho P' + \left[(k^2 - \kappa^2)\rho^2 - l^2\right]P = 0$$

The last equation has the solution

$$P = Z_l\!\left(\sqrt{k^2 - \kappa^2}\,\rho\right)$$

Consequently

$$S_{k,\kappa,l} = c\, e^{\pm i\kappa z \pm il\varphi}\, Z_l\!\left(\sqrt{k^2 - \kappa^2}\,\rho\right) \tag{7-39}$$

If this function is to be single-valued in $\varphi$, l must be an integer. Constructing
a solution of the wave equation wherein the space function has the
form (39) we thus obtain

$$U = \sum_{k,\kappa,l} S_{k,\kappa,l}\, e^{\pm ikvt}$$

But it is usually more satisfactory to indicate the nature of the summations
(l is integral, k and $\kappa$ may vary continuously) more explicitly. If, furthermore,
we limit $\sqrt{k^2 - \kappa^2}$ to real, positive values (thus again destroying
generality) and call this quantity $\mu$, the following useful representation is
obtained:

$$U = \int e^{\pm ikvt}\, dk \sum_{l=-\infty}^{\infty} e^{il\varphi} \int_0^\infty g_l(k,\mu)\, e^{\pm i\sqrt{k^2-\mu^2}\,z}\, J_l(\mu\rho)\, \mu\, d\mu \tag{7-40}$$

where we have written $c = g\mu$ for convenience later, and $J_l$ is the Bessel
function defined in sec. 2.14.


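A type of problem in which (40) is used is this. Suppose that at t = 0 and
z = 0 the disturbance $U_0(\rho,\varphi)$ is known for a single value of k (so that the
integration over dk is absent). Then

$$U_0(\rho,\varphi) = \sum_{l=-\infty}^{\infty} e^{il\varphi} \int_0^\infty g_l(\mu)\, J_l(\mu\rho)\, \mu\, d\mu \tag{7-41}$$

and from this relation all coefficients $g_l(\mu)$ can be determined. For if we
multiply both sides of (41) by $e^{-il'\varphi}$ and integrate over $\varphi$ from 0 to $2\pi$, we
obtain

$$2\pi \int_0^\infty g_{l'}(\mu)\, J_{l'}(\mu\rho)\, \mu\, d\mu = \int_0^{2\pi} U_0(\rho,\varphi)\, e^{-il'\varphi}\, d\varphi \equiv U_{l'}(\rho)$$

This, however, is nothing other than a Fourier-Bessel transformation⁵ of
$U_{l'}(\rho)$, and it follows that

$$g_l(\mu) = \frac{1}{2\pi}\int_0^\infty U_l(\rho)\, J_l(\mu\rho)\, \rho\, d\rho$$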


Problem. Show that the diffraction pattern due to a plane monochromatic wave
passing through a circular aperture of radius a is given by

$$U(\rho,z) = \text{const.} \int_0^\infty J_0(\mu\rho)\, J_1(\mu a)\, e^{i\sqrt{k^2-\mu^2}\,z}\, d\mu$$


c. Spherical (Polar) Coordinates. The equation for S is similar to (15),
except that the term $+k^2 S$ is also present on the left. The substitution

$$S = R(r) \cdot \Theta(\theta) \cdot \Phi(\varphi)$$

now leads to the three equations

$$\Phi'' = -m^2 \Phi \tag{7-42a}$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\Theta'\right) - \frac{m^2}{\sin^2\theta}\,\Theta + l(l+1)\Theta = 0 \tag{7-42b}$$

$$R'' + \frac{2}{r}R' + \left[k^2 - \frac{l(l+1)}{r^2}\right]R = 0 \tag{7-42c}$$

The second of these is the equation for associated Legendre functions (l
and m are integers again: m to insure single-valuedness in $\varphi$, l in order that
the solution $\Theta$ be a polynomial, i.e., that it should not diverge for
$\cos\theta = \pm 1$). The third equation may be transformed as follows. Put
$R = P/r$, and change the independent variable to $\xi = kr$. Eq. 42c then
takes the form

$$\frac{d^2 P}{d\xi^2} + \left[1 - \frac{l(l+1)}{\xi^2}\right]P = 0$$

Again, put $P = \sqrt{\xi}\, Q$, so that the last equation reads

$$\frac{d^2 Q}{d\xi^2} + \frac{1}{\xi}\frac{dQ}{d\xi} + \left[1 - \frac{(l+\tfrac{1}{2})^2}{\xi^2}\right]Q = 0$$

This is at once recognized as Bessel’s equation (2-57); hence

$$Q = Z_{l+1/2}(\xi)$$

so that

$$R = c\, r^{-1/2} Z_{l+1/2}(kr)$$
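
The radial solutions can be tested numerically for the two lowest values of l. The sketch below assumes the standard closed forms of the half-integer Bessel functions, so that $r^{-1/2}J_{1/2}(kr)$ is proportional to $\sin kr/kr$ and $r^{-1/2}J_{3/2}(kr)$ to $\sin kr/(kr)^2 - \cos kr/kr$; the function names and the step size are illustrative choices. It estimates the left-hand side of (42c) by central differences.

```python
import math

def radial_l0(r, k):
    """r^(-1/2) J_(1/2)(kr) up to a constant factor: sin(kr)/(kr)."""
    x = k * r
    return math.sin(x) / x

def radial_l1(r, k):
    """r^(-1/2) J_(3/2)(kr) up to a constant factor: sin(kr)/(kr)^2 - cos(kr)/(kr)."""
    x = k * r
    return math.sin(x) / (x * x) - math.cos(x) / x

def radial_residual(R, l, r, k, h=1e-3):
    """Central-difference estimate of R'' + (2/r) R' + [k^2 - l(l+1)/r^2] R, eq. (42c)."""
    centre = R(r, k)
    d2 = (R(r + h, k) + R(r - h, k) - 2.0 * centre) / (h * h)
    d1 = (R(r + h, k) - R(r - h, k)) / (2.0 * h)
    return d2 + 2.0 * d1 / r + (k * k - l * (l + 1) / (r * r)) * centre
```

Each function should give a negligible residual for its own value of l; pairing `radial_l0` with l = 1 leaves a residual that is clearly not zero, which shows that the check is sensitive to the order.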

For the space part of the wave function, we thus find

$$S_k = \sum_{m,l} c_{k,l,m}\, P_l^m(\cos\theta)\, e^{im\varphi}\, r^{-1/2} Z_{l+1/2}(kr) \tag{7-43}$$

A sum of the form

$$\sum_{m=-l}^{l} c_m P_l^m(\cos\theta)\, e^{im\varphi}$$

with arbitrary coefficients $c_m$ is often called a spherical harmonic and denoted
by the symbol $Y_l(\theta,\varphi)$. In using this symbol one must remember that the
function which it represents is not unique, but contains $2l + 1$ arbitrary
constants. With this abbreviation, then,

$$S_k = \sum_{l=0}^{\infty} c_{k,l}\, Y_l(\theta,\varphi)\, r^{-1/2} Z_{l+1/2}(kr)$$

and the wave function is

$$U = \int dk \sum_{l=0}^{\infty} c_{k,l}\, Y_l(\theta,\varphi)\, r^{-1/2} Z_{l+1/2}(kr)\, e^{\pm ikvt} \tag{7-44}$$

7.12. Examples of Solutions of the Wave Equation. — The local pressure, P,
in a gas traversed by a sound wave satisfies the wave equation.

a. The simplest type of a wave is that emitted by a “breathing”
sphere, i.e., a sphere performing volume oscillations without distortion.
It is characterized by the two boundary conditions:

(α) $P_{r=r_0} = \text{const.}\; e^{-i\omega t}$,

(β) $P_{r=\infty} = f(r,\theta)\, e^{i(kr-\omega t)}$

Condition (α) states that at the surface of the sphere (of radius $r_0$) all
points shall be in phase; condition (β) implies that at infinity the wave
shall be an outgoing one. We limit ourselves to monochromatic waves
(pure tones), so that there is only one value of k or $\omega$. Clearly, spherical
polar coordinates must here be used. Considering then eq. (44), we must
first omit the integration over k. Since in accordance with condition (α)
there must, at $r = r_0$, be no functional dependence either on $\varphi$ or on $\theta$,
both l and m are zero. Hence (44) reduces to

$$P = C r^{-1/2} Z_{1/2}(kr)\, e^{-i\omega t}$$

But the general Bessel function

$$Z_{1/2}(x) = a_1 J_{1/2}(x) + a_2 J_{-1/2}(x)$$

where $J_{1/2}(x)$ and $J_{-1/2}(x)$ are proportional to $x^{-1/2}\sin x$ and $x^{-1/2}\cos x$ respectively,
as was shown in Chapter 3. Inserting these, we have

$$P = C\left(a_1 \frac{\sin kr}{kr} + a_2 \frac{\cos kr}{kr}\right) e^{-i\omega t}$$

In order to satisfy condition (β) we put $a_1 = i$, $a_2 = 1$, obtaining

$$P = \frac{C}{kr}\, e^{i(kr - \omega t)}$$

as our final result.

b. When the sphere of the preceding example vibrates, not with spherical
symmetry, but in such a way that condition (α) reads

(α) $P_{r=r_0} = \text{const.}\; \cos\theta\; e^{-i\omega t}$

it is said to emit dipole waves. Condition (β) remains unchanged. Of all
the functions composing $Y_l(\theta,\varphi)$, only $P_1^0(\cos\theta)$ is a cosine function.
Therefore l must be 1. Hence (44) now reduces to

$$P = C r^{-1/2} Z_{3/2}(kr) \cos\theta\; e^{-i\omega t}$$

But

$$r^{-1/2} Z_{3/2}(kr) = r^{-1/2}\left[a_1 J_{3/2}(kr) + a_2 J_{-3/2}(kr)\right]$$

and this is proportional to

$$a_1\left[\frac{\sin kr}{(kr)^2} - \frac{\cos kr}{kr}\right] - a_2\left[\frac{\sin kr}{kr} + \frac{\cos kr}{(kr)^2}\right]$$

If this expression is to satisfy condition (β), it is necessary to choose
$a_1 = -1$, $a_2 = -i$, so that

$$r^{-1/2} Z_{3/2}(kr) \propto \left[\frac{1}{kr} + \frac{i}{(kr)^2}\right] e^{ikr}$$

and

$$P = C\left[\frac{1}{kr} + \frac{i}{k^2r^2}\right]\cos\theta\; e^{i(kr-\omega t)}$$

Writing $C = C_1 + iC_2$, the real part of P, which alone is of interest, will be

$$\Re P = \left(\frac{C_1}{kr} - \frac{C_2}{k^2r^2}\right)\cos\theta\,\cos(kr - \omega t) - \left(\frac{C_1}{k^2r^2} + \frac{C_2}{kr}\right)\cos\theta\,\sin(kr - \omega t)$$

For small values of r,

$$\Re P = -\frac{\cos\theta}{k^2r^2}\left[C_2\cos(kr - \omega t) + C_1\sin(kr - \omega t)\right],$$

for large r,

$$\Re P = \frac{\cos\theta}{kr}\left[C_1\cos(kr - \omega t) - C_2\sin(kr - \omega t)\right]$$

If $C_1$ is zero, the disturbance is of the form $\cos(kr - \omega t)$ near the surface
of the sphere, but of the sine form at infinity. If $C_2 = 0$, the
reverse is true. There occurs, therefore, a curious change of phase as the
wave moves outward.
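
The passage from the complex dipole wave to its real part is easy to get wrong by a sign, so a small check may be helpful. The following sketch assumes the complex form $P = C\,[1/kr + i/(kr)^2]\cos\theta\, e^{i(kr - \omega t)}$ with $C = C_1 + iC_2$; the function names and the sample numbers in the usage note are arbitrary. It compares the real part obtained by complex arithmetic with the written-out expression.

```python
import cmath
import math

def dipole_pressure(r, theta, t, k, omega, C):
    """Complex P = C (1/(kr) + i/(kr)^2) cos(theta) exp(i(kr - omega t))."""
    x = k * r
    return C * (1.0 / x + 1j / (x * x)) * math.cos(theta) * cmath.exp(1j * (x - omega * t))

def real_part_formula(r, theta, t, k, omega, C1, C2):
    """Real part written out with C = C1 + i C2."""
    x = k * r
    phase = x - omega * t
    return ((C1 / x - C2 / (x * x)) * math.cos(theta) * math.cos(phase)
            - (C1 / (x * x) + C2 / x) * math.cos(theta) * math.sin(phase))
```

For instance, `dipole_pressure(1.7, 0.5, 0.2, 2.0, 3.0, complex(0.7, -0.4)).real` should coincide with `real_part_formula(1.7, 0.5, 0.2, 2.0, 3.0, 0.7, -0.4)`, and for very large r only the terms in 1/kr should survive.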

7.13. Equation of Heat Conduction and Diffusion. — The temperature
U in a homogeneous medium, in which A(x,y,z) calories of heat are generated
(by some unspecified agency) per unit of volume surrounding the
point (x,y,z) per second, and which has density $\rho$, specific heat s, and thermal
conductivity $\kappa$, satisfies the partial differential equation

$$\frac{\partial U}{\partial t} = \frac{\kappa}{\rho s}\nabla^2 U + \frac{A}{\rho s} \tag{7-45}$$

Various simplifying conditions may arise: In the first place, attention may
be confined to “steady states,” that is, to temperature distributions which
do not change with time. Such states will always occur in physical and
chemical problems after heat conduction has taken place for a sufficiently
long time. In that case, $\partial U/\partial t$ is zero, and the equation reads

$$\nabla^2 U = -\frac{A}{\kappa} \tag{7-46}$$

It is of the form of Poisson’s equation which will be discussed in sec. 7.17.
If, in addition, it is assumed that no heat is generated anywhere within the
body, A = 0 and (46) becomes identical with Laplace’s equation which
we have already studied.

Of greater interest is the situation in which, to be sure, A is taken to be
zero, but consideration is given to non-steady states. The temperature is
then subject to the equation

$$\frac{\kappa}{\rho s}\nabla^2 U - \frac{\partial U}{\partial t} = 0 \tag{7-47}$$

which is very similar to the wave equation. (This equation is derived by
vector methods in sec. 4.18.)

In the kinetic theory, one meets the equation of diffusion which regulates
the flow of fluid matter within another material medium. It states in its
basic form:

$$\frac{\partial U}{\partial t} = \nabla \cdot (D\nabla U) \tag{7-48}$$

U represents the concentration of fluid matter, D its coefficient of diffusion.
Strictly speaking, D is a function of U and hence of (x,y,z). But for small
concentrations D is found to be very nearly constant. For that case, then,
(48) may be written

$$D\nabla^2 U - \frac{\partial U}{\partial t} = 0 \tag{7-49}$$

All parameters appearing in (49) as well as in (47) are positive, hence
both of these equations will be written in the form

$$a^2 \nabla^2 U - \frac{\partial U}{\partial t} = 0 \tag{7-50}$$

and we remember that, for heat conduction, U = temperature and
$a^2 = \kappa/\rho s$, while for diffusion, U = concentration and $a^2 = D$. The
remainder of this section is devoted to the solutions of eq. (50).

Separation may at once be achieved by putting $U = S(x,y,z) \cdot T(t)$,
and it is found that $a^2 \nabla^2 S/S = \dot{T}/T$. On equating the right-hand side to
$-a^2k^2$, k being an arbitrary constant, it is seen that

$$T_k = \text{const.}\; e^{-a^2k^2t} \tag{7-51}$$

while S must satisfy

$$\nabla^2 S + k^2 S = 0 \tag{7-52}$$

an equation identical with the space form of the wave equation, (33). If,
therefore, we combine the solutions of (33), discussed in the preceding
section, with $T_k$ in the form (51), we have an answer to the problems of
heat conduction and diffusion.

7.14. Example: Linear Flow of Heat. — Suppose that heat flows in a
linear filament placed along the X-axis. The solution of (52) is then

$$S_k = c_k e^{ikx} + d_k e^{-ikx}$$

and this may be taken as

$$S_k = c_k e^{ikx}$$

if we assign both positive and negative values to k. The general solution
reads:

$$U = \int_{-\infty}^{\infty} c(k)\, e^{ikx - a^2k^2t}\, dk \tag{7-53}$$

Every choice for c(k) will satisfy eq. (47), but the proper selection is to
be made in accordance with initial conditions. Let us suppose, then, that
$U = U_0(x)$ at t = 0. Eq. (53) now states:

$$U_0(x) = \int_{-\infty}^{\infty} c(k)\, e^{ikx}\, dk$$

and c(k) may be obtained from this by means of a Fourier transformation.
In view of eq. (8-13),

$$c(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} U_0(x')\, e^{-ikx'}\, dx'$$

so that (53) becomes

$$U(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U_0(x')\, e^{ik(x-x') - a^2k^2t}\, dx'\, dk$$

The integration with respect to k can be performed:

$$\int_{-\infty}^{\infty} e^{-a^2tk^2 + ik(x-x')}\, dk = \frac{\sqrt{\pi}}{a\sqrt{t}}\, e^{-(x-x')^2/4a^2t}$$

Hence

$$U(x,t) = \frac{1}{2a\sqrt{\pi t}}\int_{-\infty}^{\infty} U_0(x')\, e^{-(x-x')^2/4a^2t}\, dx' \tag{7-54}$$

Problems.

a. Prove that (54) reduces to $U_0(x)$ for t = 0.

b. Show that, if $U_0(x)$ is a step function such that $U_0 = 1$ if x < 1 and
$U_0 = 0$ if x > 1, ... where $\phi(x)$ is the error integral

$$\phi(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-y^2}\, dy$$

c. Show that, if $U_0 = 1$ for x > 0 and $U_0 = 0$ for x ≤ 0,

$$\text{then } U = \frac{1}{2}\left[1 + \phi\!\left(\frac{x}{2a\sqrt{t}}\right)\right]$$

Interpret the last two problems from the point of view of diffusion.

d. Suppose $U_0$ is a “function” which is everywhere zero except at x = 0, where it
tends to $\infty$ in such a way that $\int U_0(x)\,dx = 1$. (Such a “function” was introduced
by Dirac and is commonly known to physicists as a δ-function. Strictly speaking it is ...)
Discuss the temperature at any point x, and show in particular that it will rise to a
maximum at $t = x^2/2a^2$. This fact affords a simple experimental determination of a,
and hence of D and the thermal quantities.
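
Two of these statements can be tried out numerically. The sketch below is an illustration under stated assumptions: the substitution $x' = x + 2a\sqrt{t}\,y$, the midpoint rule, the truncation of the integral at |y| = 8, and all function names are choices made here, not taken from the text. It evaluates (54) for the step of problem c, compares the result with $\tfrac{1}{2}[1 + \phi(x/2a\sqrt{t})]$ using the library error function, and also tabulates the temperature due to a concentrated source, $e^{-x^2/4a^2t}/2a\sqrt{\pi t}$, whose maximum should fall at $t = x^2/2a^2$.

```python
import math

def heat_solution(x, t, u0, a=1.0, half_width=8.0, n=4000):
    """Midpoint-rule evaluation of eq. (54) for an initial profile u0(x')."""
    scale = 2.0 * a * math.sqrt(t)
    # With x' = x + scale*y the kernel of (54) becomes exp(-y^2)/sqrt(pi).
    dy = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        y = -half_width + (i + 0.5) * dy
        total += u0(x + scale * y) * math.exp(-y * y)
    return total * dy / math.sqrt(math.pi)

def unit_step(x):
    """U0 = 1 for x > 0 and 0 for x <= 0 (problem c)."""
    return 1.0 if x > 0 else 0.0

def step_closed_form(x, t, a=1.0):
    """U = (1/2) [1 + erf(x / (2 a sqrt(t)))]."""
    return 0.5 * (1.0 + math.erf(x / (2.0 * a * math.sqrt(t))))

def point_source(x, t, a=1.0):
    """Eq. (54) for a source concentrated at the origin."""
    return math.exp(-x * x / (4.0 * a * a * t)) / (2.0 * a * math.sqrt(math.pi * t))
```

The quadrature is crude near the discontinuity of the step, so agreement with the closed form should be expected only to a few parts in a thousand. For the concentrated source with x = 1 and a = 1, `point_source(1.0, 0.5)` should exceed the values at neighbouring times such as 0.45 and 0.55.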

7.15. Two-Dimensional Flow of Heat. — In polar coordinates, S is given
by eq. (35). Hence

$$U = \int c(k)\, dk \left(\sum_m a_m Z_m(k\rho)\, e^{im\varphi}\right) e^{-a^2k^2t} \tag{7-55}$$

If, as we shall suppose, the temperature distribution at t = 0 is radially
symmetrical, so that U does not depend on $\varphi$, the only value permitted to m
is zero. Also, since $Z_0$ is an even function, the integration in (55) may be
taken from 0 to $\infty$ without error. For $Z_0$ we shall take the $J_0$-function,
because it will at once be seen that most temperature distributions can be
expressed in terms of $J_0$ alone. Thus

$$U = \int_0^\infty c(k)\, J_0(k\rho)\, e^{-a^2k^2t}\, dk \tag{7-56}$$

Let us write

$$c(k) = k\, g(k)$$

and suppose that $U = U_0(\rho)$ at t = 0. It is then easy to determine g(k)
formally and hence $U(\rho,t)$. For in accordance with (56)

$$U_0 = \int_0^\infty g(k)\, J_0(k\rho)\, k\, dk$$

in other words, g(k) is the Fourier-Bessel transform of $U_0(\rho)$.
Hence

$$g(k) = \int_0^\infty U_0(\rho')\, J_0(k\rho')\, \rho'\, d\rho'$$

(Cf. Sec. 8.3.)

When this is put back into (56) the final form of $U(\rho,t)$ is obtained:

$$U(\rho,t) = \int_0^\infty\!\!\int_0^\infty U_0(\rho')\, J_0(k\rho')\, J_0(k\rho)\, e^{-a^2k^2t}\, k\rho'\, dk\, d\rho' \tag{7-57}$$

Problem. Show that, if $U_0(\rho)$ is concentrated at $\rho = 0$, and

$$\int_0^\infty U_0(\rho)\,\rho\, d\rho = 1,$$

$$U = \frac{1}{2a^2t}\, e^{-\rho^2/4a^2t}$$

Compare this with problem (d) of Sec. 7.14. Interpret above as a diffusion problem.

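7.16. Heat Flow in Three Dimensions. — In rectangular coordinates, U
is given as a generalization of eq. (53) (cf. eq. (36) for the form of S):

$$U = \iiint c(k_1,k_2,k_3)\, e^{i(k_1x + k_2y + k_3z) - a^2(k_1^2 + k_2^2 + k_3^2)t}\, dk_1\,dk_2\,dk_3$$

or, with the use of the vector notation previously explained (cf. footnote
on p. 227)

$$U = \iiint c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r} - a^2k^2t}\, d\mathbf{k} \tag{7-58}$$

We now repeat essentially the procedure leading from (53) to (54), but
using three variables instead of one.

$$U_0(x,y,z) = \iiint c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{k}$$

hence $c(\mathbf{k})$ is the Fourier transform of $U_0$:

$$c(\mathbf{k}) = \frac{1}{(2\pi)^3}\iiint U_0(x',y',z')\, e^{-i\mathbf{k}\cdot\mathbf{r}'}\, dx'\,dy'\,dz'$$

whence

$$U(x,y,z,t) = \frac{1}{(2\pi)^3}\int\!\cdots\!\int U_0(x',y',z')\, e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}') - a^2k^2t}\, d\mathbf{r}'\, d\mathbf{k}$$

$$= (2a\sqrt{\pi t})^{-3}\iiint U_0(x',y',z')\, e^{-(\mathbf{r}-\mathbf{r}')^2/4a^2t}\, d\mathbf{r}' \tag{7-59}$$

If $U_0$ is a function of x alone, the integration over y' and z' may be performed,
and the result is identical with (54), as it should be. Of greatest
practical importance is the case where $U_0$ is a function of r alone. The
volume element $d\mathbf{r}'$ may then be written in polar form: $r'^2\, dr' \sin\theta'\, d\theta'\, d\varphi'$,
and the integration over $\theta'$ and $\varphi'$ can be performed. It is to be observed
in this connection that

$$(\mathbf{r} - \mathbf{r}')^2 = r^2 + r'^2 - 2rr'\cos\theta'$$

One then finds

$$U(r,t) = (2ar\sqrt{\pi t})^{-1}\int_0^\infty U_0(r')\left[e^{-[(r-r')/2a\sqrt{t}]^2} - e^{-[(r+r')/2a\sqrt{t}]^2}\right] r'\, dr'$$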

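Problem. Show that, if

$$U_0 = \begin{cases} 1 & \text{for } r \le 1 \\ 0 & \text{for } r > 1 \end{cases}$$

$$U = \frac{1}{2}\left[\phi(\xi_+) + \phi(\xi_-)\right] + \frac{a\sqrt{t}}{r\sqrt{\pi}}\left(e^{-\xi_+^2} - e^{-\xi_-^2}\right)$$

where

$$\xi_\pm = \frac{1 \pm r}{2a\sqrt{t}}$$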




7.17. Poisson’s Equation. — All partial differential equations treated ...


that the method of separation of variables may work. The variety of 
linear and inhomogeneous equations of importance in scientific analysis 
is also great, but there exists for their solution no method nearly so 
powerful as the separation of variables. 

An equation like (33), the space form of the wave equation, would 
become inhomogeneous if the right-hand side were not zero but some 
function f(x,y, z). One remarkable feature of an inhomogeneous equation, 
which will here only be mentioned, is that it may not possess solutions for 
every value of k even though the homogeneous equation, with the same 
boundary condition, has solutions. The inhomogeneity selects, as it were, 
special values of the parameter k for which solutions are possible. This 
phenomenon, which is the rule for inhomogeneous equations, may also 
occur for homogeneous ones if the boundary or initial conditions of the 
problem are sufficiently stringent. It will be discussed under the heading 
“ characteristic values ” or “ eigenvalues ” in Chapter 8. 

An inhomogeneous equation which is rather common is Poisson’s;
it will here be chosen to illustrate a process of solution. Its general form is:

$$\nabla^2 \Phi = f(x,y,z) \tag{7-60}$$

One encounters it (1) in electrostatics, where $\Phi$ is the ordinary potential
and f represents a constant times the distribution of charge,⁶ $\rho(x,y,z)$, the
constant depending on the units chosen; (2) in the theory of heat flow,
where eq. (45) takes the form (60) when $\partial U/\partial t = 0$, as shown in eq. (46).

To solve (60) we first recall Green’s theorem (see sec. 4.19) which
states that, for any two functions of space coordinates, u and v, which are
finite, continuous and have continuous first and second derivatives,

$$\int_\tau (u\nabla^2 v - v\nabla^2 u)\, d\tau = \int_\sigma (u\nabla v - v\nabla u)\cdot d\boldsymbol{\sigma} \tag{7-61}$$

Here $\tau$ represents a certain closed volume and $\sigma$ its surface; $d\boldsymbol{\sigma}$ is taken
positive in the direction outward from the volume. In our problem we are
given the function f(x,y,z) and we wish to find $\Phi(x',y',z')$ for a fixed point of
observation (x'y'z'). In the following it is necessary to distinguish
between this fixed point, which will be denoted by primes, and the variable
point (xyz) over which integrations are to be performed.

It will prove convenient to consider, in connection with theorem (61), a
volume $\tau$ such as that depicted in Fig. 2. It is bounded by the outer surface
$\sigma_2$ and the inner surface $\sigma_1$, a spherical cavity of radius $s_0$ about the
fixed point P'. The function u will be specified to be

$$u = \frac{1}{s} = \frac{1}{|\mathbf{r} - \mathbf{r}'|}$$

it satisfies Laplace’s equation $\nabla^2 u = 0$, as may readily be verified. Then
eq. (61) reads:

$$\int_\tau \frac{1}{s}\nabla^2 v\, d\tau = \int_\sigma \left[\frac{1}{s}\nabla v - v\nabla\!\left(\frac{1}{s}\right)\right]\cdot d\boldsymbol{\sigma} \tag{7-62}$$

If now we interpret v as $\Phi$, we may replace $\nabla^2 v$ by f in accordance with (60).
The right-hand side of (62) consists of two integrations, one over $d\sigma_1$ and the
other over $d\sigma_2$. Consider first that over $d\sigma_1$. Clearly, $\nabla\Phi \cdot d\boldsymbol{\sigma}_1$ approaches
$-(\partial\Phi/\partial r)_{P'}\, d\sigma_1$ as $s_0$ tends to zero, the minus sign coming from the fact
that $d\boldsymbol{\sigma}_1$ is inward with respect to the cavity. Hence

$$\int_{\sigma_1} \frac{1}{s}\nabla\Phi \cdot d\boldsymbol{\sigma}_1 \to -\frac{1}{s_0}\left(\frac{\partial\Phi}{\partial r}\right)_{P'} 4\pi s_0^2 \to 0$$

provided $\Phi$ has a finite derivative. The second integral on the right of
(62), when taken over $\sigma_1$, becomes, in the limit as $s_0 \to 0$,

$$-\int_{\sigma_1} \Phi\,\nabla\!\left(\frac{1}{s}\right)\cdot d\boldsymbol{\sigma}_1 = -\Phi(x',y',z')\,\frac{1}{s_0^2}\, 4\pi s_0^2 = -4\pi\Phi(x',y',z')$$

Fig. 7-2



Hence, if it is assumed that $s_0 \to 0$, eq. (62) reduces to

$$\int_\tau \frac{f}{s}\, d\tau = -4\pi\Phi(x',y',z') + \int_{\sigma_2}\left[\frac{1}{s}\nabla\Phi - \Phi\nabla\!\left(\frac{1}{s}\right)\right]\cdot d\boldsymbol{\sigma}_2 \tag{7-63}$$

Here the remaining integral over $\sigma_2$ on the right has a rather simple meaning.
It is a solution of Laplace’s equation in the form: $\nabla'^2 \Phi(x',y',z') = 0$,
for the only quantities which depend on the primed coordinates are 1/s
and $\nabla(1/s)$, and these clearly satisfy it. Hence if this whole integral were
subtracted from $\Phi$, the remainder would still satisfy eq. (60). It is indeed
easily seen that

$$\frac{1}{4\pi}\int_{\sigma_2}\left[\frac{1}{s}\nabla\Phi - \Phi\nabla\!\left(\frac{1}{s}\right)\right]\cdot d\boldsymbol{\sigma}_2$$

represents the contribution to $\Phi$ coming from those parts of f(x,y,z) which
lie outside of $\tau$. In the electrostatic case, it represents the potential due
to the charge outside of the volume $\tau$ considered.

The integral over $\sigma_2$ may be eliminated in another way. Suppose we
allow $\tau$ to become infinite and impose on $\Phi$ the boundary condition that, at
infinity, it vanish at least as strongly as 1/r. Then $\nabla\Phi/s$ and $\Phi\nabla(1/s)$ are
both of order $1/r^3$ at $\infty$, and after the surface integration, which amounts
to multiplication by $r^2$, the result will still be of the order 1/r and hence
vanish.

Of interest, therefore, is chiefly the particular solution which remains 
when the integral over <r 2 in (63) is omitted: it is usually referred to as the 
solution of Poisson’s equation. Thus 




i Cf( x >y> z ) j j , 

— I 1 7| dxdydz 

4?r r — r 


(7-64) 


Problem. Show that, when $f(x,y,z)$ is different from zero only within a finite volume
$\tau_0$ such that $\int_{\tau_0} f(x,y,z)\,d\tau = q$, then for any point $(x',y',z')$ far removed from $\tau_0$,

$$\phi(x',y',z') = -\frac{q}{4\pi r'}$$

the origin being chosen inside $\tau_0$. Interpret this result in electrostatics.


REFERENCES 


Frank, P. and von Mises, R., “Die Differential- und Integralgleichungen der Mechanik
und Physik,” Vieweg, Brunswick, 1935. (This is the eighth edition of the famous





Sommerfeld, A., “Partial Differential Equations in Physics,” Academic Press, New
York, 1949.

Bateman, H., “Partial Differential Equations of Mathematical Physics,” Dover
Publications, New York, 1944.

Tamarkin, J. D. and Feller, W., “Partial Differential Equations,” Brown University,
Providence, 1941.

Webster, A. G., “Partial Differential Equations of Mathematical Physics,” Stechert,
New York, 1933.

Coulson, C. A., “Waves,” Oliver and Boyd, Edinburgh, 1941.



CHAPTER 8 


EIGENVALUES AND EIGENFUNCTIONS

8.1. Simple Examples of Eigenvalue Problems. — It frequently happens 
in mathematical analysis that a given equation, or a set of equations, 
yields solutions which are in general uninteresting or trivial, except when 
a certain parameter appearing in the equations is given a definite value. 
Such circumstances give rise to eigenvalues 1 and eigenfunctions. 1 Their 
occurrence is so common that it often goes unrecognized. For illustration, 
let us take a very simple (and useless) example. 

Suppose one wishes to solve the two simultaneous equations 

$(1 - \lambda)x + 2y = 0, \qquad 2x + (1 - \lambda)y = 0$

To be sure, they always possess solutions; but they are almost always
$x = 0$, $y = 0$. Only for two values of the parameter $\lambda$ will this not be true:
for $\lambda = 3$ the solution is $x = y$; for $\lambda = -1$ it is $x = -y$. (The numerical
values of $x$ or $y$ are of course never fixed by the linear homogeneous equations
above.) The two values of $\lambda$ for which the equations possess non-vanishing
solutions are said to be eigenvalues; the two corresponding solutions
are called eigenfunctions.
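The two eigenvalues quoted above can be checked by a short computation. The following is only an illustrative sketch: the helper names `eigenvalues_2x2` and `residual` are assumptions made for this example and do not come from the text. The eigenvalues are obtained as the roots of the determinant condition, and the stated solutions are substituted back into the two equations.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Roots of det([[a - lam, b], [c, d - lam]]) = 0 (assumed real)."""
    trace = a + d
    det = a * d - b * c
    disc = math.sqrt(trace * trace - 4.0 * det)
    return (trace - disc) / 2.0, (trace + disc) / 2.0

def residual(lam, x, y):
    """Left-hand sides of the two simultaneous equations for a trial (x, y)."""
    return ((1 - lam) * x + 2 * y, 2 * x + (1 - lam) * y)

# Coefficient matrix of the system is [[1, 2], [2, 1]].
lam_minus, lam_plus = eigenvalues_2x2(1, 2, 2, 1)
print(lam_minus, lam_plus)            # the two eigenvalues
print(residual(lam_plus, 1, 1))       # x = y for the larger eigenvalue
print(residual(lam_minus, 1, -1))     # x = -y for the smaller eigenvalue
```

Both residuals vanish, while for any other value of $\lambda$ only $x = y = 0$ makes them vanish.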

Eigenvalues are not always denumerable and discrete, as in the fore- 
going example. To show this, we choose an even more trivial illustration. 
The equation 

$x^2 = \lambda$

always possesses a solution. If, however, we wish a real solution, $\lambda$ is at
once limited to the domain of positive numbers. Hence we may properly
say that $x^2 = \lambda$ is an equation leading to eigenvalues $\lambda \geq 0$ and corresponding
eigenfunctions $x = \sqrt{\lambda}$.

In both examples eigenvalues were called into being by the imposition of 
special conditions: in the first that the solutions shall not vanish every- 
where; in the second that the solution shall be real. This is generally true; 
eigenvalues are always produced by special requirements placed upon the 
solutions of equations. In the most interesting cases of physics and chemistry
these conditions are boundary conditions. We now turn to some cases of greater
scientific interest.

8.2. Vibrating String; Fourier Analysis. — In classical physics, many 
eigenvalue problems occur in connection with vibrating systems. The 
simplest of these is the problem of a vibrating string. Consider the string 
to extend along the X-axis, to be fastened with its left end at the origin and 
its right end at x = l. From elementary physics we recall that if its mass 
per unit length is $m$ and its tension $T$, the speed of waves along the string
is given by $v = \sqrt{T/m}$. The wave equation, discussed in sec. 7.8, will
then read

$$\frac{\partial^2 U}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 U}{\partial t^2} \qquad (8\text{-}1)$$

$U$ is the vertical displacement of the points along $x$.

We restrict our attention for the moment to types of vibration having a
single frequency $\nu$, or angular frequency $\omega = 2\pi\nu$, so that $U = S(x)e^{i\omega t}$ or,
in real form, $U = S(x)\sin\omega t$ or $U = S(x)\cos\omega t$. The
function $S$ will then satisfy the ordinary differential equation

$$\frac{d^2S}{dx^2} + k^2S = 0 \qquad (8\text{-}2)$$

where, in conformity with the usage of Chapter 7, the abbreviation

$$k = \frac{\omega}{v} = \frac{2\pi}{\lambda} \qquad (8\text{-}3)$$

has been used. Here $\lambda$ stands again for the wave length of the disturbance
produced. The general solution of (2) is, clearly,

$$S = A\sin(kx + \delta) \qquad (8\text{-}4)$$

where $A$ and $\delta$ are arbitrary constants. Every solution of the form (4) is
perfectly acceptable as far as the differential equation is concerned, but it
does not describe the behavior of the string. Solution (4) permits the ends
of the string to vibrate, whereas the physical condition requires them to be
fixed. It is therefore necessary to impose the following boundary conditions
upon the solutions (4):

(a) $S(0) = 0$
(b) $S(l) = 0$



arbitrary constant, $\delta$, left for adjustment. It must be ta
order to satisfy condition (a). (Choice of $\pi$, $2\pi$, etc., l
final result.) But the function $S = A \sin kx$ will not obe
problem can be solved only if we are willing to tamper with
eigenvalues. If $\sin kl$ is to be zero, $k$ must be 0, or $\pi/l$, $2\pi/l$
value 0, however, is excluded for the same reason that $A$
To each eigenvalue of $k = n\pi/l$ ($n$ integral), there
eigenfunction $S_n = A_n \sin n\pi x/l$. These eigenfunction
undetermined with respect to the constant multipliers, $A$
chosen at will, and which may be different for every $n$.

Since $k$ is related to $\lambda$ by eq. (3), there is thus generate
set of eigenvalues for $\lambda$, namely $\lambda = 2l/n$, $n$ integral,
known equation for the wave lengths of standing waves
vibrating string. In the simplest mode of vibration, con
fundamental frequency, $\lambda = 2l$, the string has nodes only
For the first harmonic, $\lambda = l$, there is in addition a node a
string, and so on. In general, the number of nodes is $n$

The eigenfunctions under consideration have two imp
which, as we shall see in sec. 8.5 et seq., are common t
eigenfunctions arising in connection with different pro
(1) orthogonality, (2) completeness. To explain the meani
let us arrange the eigenvalues of eq. (2) in a definite
$n = 1, 2, 3 \cdots$; and write again $S_n = A_n \sin n\pi x/l$
means:

$$\int_0^l S_n(x)S_m(x)\,dx = c_n\delta_{nm} \qquad (8\text{-}5)$$

The word comes originally from vector analysis (cf. Chaf
vectors, $\mathbf{A}$ and $\mathbf{B}$, are said to be orthogonal if $\mathbf{A}\cdot\mathbf{B}$ =
$A_zB_z = 0$. Similarly, vectors in $N$ dimensions having c
($i = 1, 2, \cdots N$) are said to be orthogonal when $\sum_{i=1}^{N} A_iB_i$
imagine a vector space of an infinite number of dimensi
components $A_i$ and $B_i$ become continuously distributed
dense, $i$ is no longer a denumerable index but a contin
and the scalar product $\sum A_iB_i$ turns into $\int A(x)B(x)\,dx$.
functions $A$ and $B$ are said to be orthogonal, and this is t
the word is used above.

The idea of orthogonality is indefinite unless referei 




and the constant $c_n$ is seen to be

$$c_n = A_n^2\int_0^l \sin^2\frac{n\pi x}{l}\,dx = A_n^2\,\frac{l}{n\pi}\int_0^{n\pi}\sin^2 u\,du = \frac{l}{2}A_n^2$$

For many purposes it is convenient to have $c_n$ equal to unity. This
can always be achieved by a suitable choice of $A_n$. In the present case,
every $c_n = 1$ if $A_n = \sqrt{2/l}$. If, therefore, we write $S_n = \sqrt{2/l}\,\sin(n\pi x/l)$,
the orthogonality relation (5) reads

$$\int_0^l S_n(x)S_m(x)\,dx = \delta_{nm} \qquad (8\text{-}5')$$


When the constants A n are thus chosen the eigenfunctions are said to be 
normalized; functions satisfying (5') will henceforth be termed ortho- 
normal. It is clear that a set of functions having the property of orthogon- 
ality (expressed by eq. 5) can always be made ortho-normal by a proper 
choice of multiplicative constants. 
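The ortho-normality of the normalized string eigenfunctions may be illustrated numerically. The sketch below is not part of the original development: the function names `S` and `inner`, the midpoint quadrature, and the sample length $l = 3$ are all assumptions chosen for the example.

```python
import math

def S(n, x, l):
    """Normalized eigenfunction sqrt(2/l) * sin(n*pi*x/l) of the fixed string."""
    return math.sqrt(2.0 / l) * math.sin(n * math.pi * x / l)

def inner(n, m, l, steps=2000):
    """Midpoint-rule estimate of the integral of S_n * S_m over (0, l)."""
    h = l / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += S(n, x, l) * S(m, x, l)
    return total * h

l = 3.0  # arbitrary sample length of the string
for n in range(1, 4):
    # Each row should approximate the corresponding row of the unit matrix.
    print([round(inner(n, m, l), 6) for m in range(1, 4)])
```

The printed table approximates the Kronecker symbol $\delta_{nm}$; replacing $\sqrt{2/l}$ by another constant changes only the diagonal entries.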

A simple modification in the idea of orthogonality is to be made when 
complex functions are considered. For these, condition (5) must be re- 
placed by 

$$\int S_n^*(x)S_m(x)\,dx = c_n\delta_{nm} \qquad (8\text{-}5^*)$$


where S* represents the complex conjugate of S . This definition will be 
used in later work. 

We turn to the second property, that of completeness. A set of functions
is said to be complete if an arbitrary function, $f(x)$, satisfying the same
boundary conditions as the functions of the set, can be expanded as follows:

$$f(x) = \sum_{n=1}^{\infty} a_nS_n(x) \qquad (8\text{-}6)$$

the $a_n$ being constant coefficients.

In the present instance, eq. (6) is equivalent to the theorem of Fourier
which states, in its simplest form, that a function $f(x)$ which vanishes both
at $x = 0$ and at $x = \pi$ (and has but a finite number of finite discontinuities)
may always be written 2

$$f(x) = \sum_{n=1}^{\infty} a_n\sin nx \qquad (8\text{-}7a)$$

2 See, for instance, Byerly, W. E., “Fourier Series and Spherical Harmonics.”





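the coefficients being given by

$$a_n = \frac{2}{\pi}\int_0^{\pi} f(\xi)\sin n\xi\,d\xi \qquad (8\text{-}7b)$$

Eqs. (7) may be modified by using, in place of $x$ and $\xi$, the variables $\pi x/l$
and $\pi\xi/l$. This has the effect of changing the range of $x$ from $(0,\pi)$ to
$(0,l)$, and the results are

$$f(x) = \sum_{n=1}^{\infty} a_n\sin\left(\frac{n\pi}{l}x\right) \qquad (8\text{-}8a)$$

$$a_n = \frac{2}{l}\int_0^{l} f(\xi)\sin\left(\frac{n\pi}{l}\xi\right)d\xi \qquad (8\text{-}8b)$$

If $S$ is taken in its normalized form, these equations read simply

$$f(x) = \sum_{n=1}^{\infty} a_nS_n(x), \qquad a_n = \int_0^l f(\xi)S_n(\xi)\,d\xi$$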
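As a hedged illustration of eqs. (8-8a) and (8-8b), the sketch below computes sine coefficients numerically for the sample function $f(x) = x(l - x)$, which vanishes at both ends. The function names, the midpoint quadrature and the choice of $f$ are assumptions made for this example only. For this particular $f$ the integral can be done by hand and gives $a_n = 8l^2/(n\pi)^3$ for odd $n$ and zero for even $n$, which serves as a check.

```python
import math

def sine_coefficient(f, n, l, steps=4000):
    """Midpoint-rule estimate of a_n = (2/l) * integral_0^l f(xi) sin(n pi xi / l) d xi."""
    h = l / steps
    total = 0.0
    for j in range(steps):
        xi = (j + 0.5) * h
        total += f(xi) * math.sin(n * math.pi * xi / l)
    return 2.0 / l * total * h

def partial_sum(coeffs, x, l):
    """Sum of a_n sin(n pi x / l) for n = 1 .. len(coeffs)."""
    return sum(a * math.sin((n + 1) * math.pi * x / l)
               for n, a in enumerate(coeffs))

l = 2.0
f = lambda x: x * (l - x)          # sample function, zero at x = 0 and x = l
coeffs = [sine_coefficient(f, n, l) for n in range(1, 26)]
print(coeffs[0], 8.0 * l * l / math.pi ** 3)   # numerical versus closed form
print(partial_sum(coeffs, 1.0, l), f(1.0))     # series versus function at mid-point
```

Because the coefficients fall off as $1/n^3$, a few terms already reproduce the function closely; a function with discontinuities would converge far more slowly.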

The fact of completeness has an important bearing on the problem of the
vibrating string which we originally set out to solve. While it is true that
only a particular $S_n$, for which $k$ assumes a specific eigenvalue, is a solution
of eq. (2) [the series (6) would not be a solution of (2)!], the value of $k$ is
not prescribed by eq. (1). Hence eq. (1) is satisfied by

$$U = \sum_n c_nS_n(x)\cos\omega_nt, \qquad \omega_n = vk_n$$

with arbitrary coefficients $c_n$. This, then, is the most general solution of
the string problem. It reduces to a series like (6) for $t = 0$, a series which
can be chosen to represent any function $f(x)$ which vanishes at the end
points. Hence it is seen that any initial configuration of the string will
yield a solution of eq. (1), that is, a (standing) wave.

Fourier analysis is so useful a tool in applied mathematics that it seems
well here to digress for a moment and summarize its essential features
beyond the needs of the present problem. Details may be found in the
book by Byerly already mentioned. The general theory, including proofs
for the statements here made, will be found in Secs. 8.5-8.8. A function
$f(x)$ defined between $x = 0$ and $x = \pi$ may also be expanded as a cosine
series:

$$f(x) = \tfrac{1}{2}b_0 + \sum_{n=1}^{\infty} b_n\cos nx \qquad (8\text{-}9a)$$




except that the series may not yield the same values as $f(x)$ at discontinuities
and at the end points. Otherwise the developments (7) and (9) are
equivalent. There is, however, an interesting difference in the values of the
two series when they are extrapolated to the range $(-\pi, 0)$. Here the series
(7) changes sign in such a way that $f(-x) = -f(x)$, while series (9) yields
$f(-x) = f(x)$, as is evident from the fact that $\sin x$ is an odd, $\cos x$ an even
function. Thus, if it is desired to expand a function between $-\pi$ and $+\pi$,
series (7) can be used only if the function is odd, series (9) when it is even.
Now any function can be represented as the sum of an even and an odd one.
Hence, if an arbitrary function $f(x)$ is to be developed between $-\pi$ and $+\pi$,
both cosine and sine series must be used. It is evident, therefore, that in
this more general case

$$f(x) = \sum_{n=1}^{\infty} a_n\sin nx + \tfrac{1}{2}b_0 + \sum_{n=1}^{\infty} b_n\cos nx \qquad (8\text{-}10a)$$

where

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\xi)\sin n\xi\,d\xi; \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\xi)\cos n\xi\,d\xi \qquad (8\text{-}10b)$$

The coefficients in front of the integrals are most easily checked as follows.
Multiply (10a) by $\sin mx$ and integrate over $x$ between $-\pi$ and $\pi$. Because
of the mutual orthogonality of the functions $\sin nx$, $\sin mx$, $\cos nx$, for $n \neq m$
the relations (10b) are at once apparent.

If $f(x)$ is defined, not in the range $(-\pi,\pi)$, but in $(-l,l)$, a simple change
of variable from $x$, $\xi$ to $(\pi/l)x$, $(\pi/l)\xi$ in eq. (10) will produce the required
modification. The result is

$$f(x) = \sum_n a_n\sin\frac{n\pi}{l}x + \tfrac{1}{2}b_0 + \sum_n b_n\cos\frac{n\pi}{l}x$$

$$a_n = \frac{1}{l}\int_{-l}^{l} f(\xi)\sin\frac{n\pi\xi}{l}\,d\xi, \qquad b_n = \frac{1}{l}\int_{-l}^{l} f(\xi)\cos\frac{n\pi\xi}{l}\,d\xi \qquad (8\text{-}11)$$

This may be expressed more simply in complex form. For if the sine and
cosine functions are written in their exponential form, the reader will verify
without difficulty 3 that

$$f(x) = \sum_{n=-\infty}^{\infty} c_ne^{in\pi x/l}, \qquad c_n = \frac{1}{2l}\int_{-l}^{l} f(\xi)e^{-in\pi\xi/l}\,d\xi \qquad (8\text{-}12)$$

The coefficients $c_n$ in this expansion are complex.

When the series $f(x)$ as given by (12) is extrapolated beyond the range
$(-l, l)$, the function is repeated periodically in every interval between
$(2n + 1)l$ and $(2n + 3)l$. Hence formula (12) permits representation of
periodic functions only. One may wonder, therefore, whether it is possible
to perform a Fourier analysis of a non-periodic function, defined in the
range of the entire real axis. Highly technical considerations for which the
reader is referred to more specific treatises 4 affirm this possibility, provided
the function, $f(x)$, to be expanded is piecewise continuous and such that the
integral $\int_{-\infty}^{\infty}|f(x)|\,dx$ exists. In that case

$$f(x) = \int_{-\infty}^{\infty} c(k)e^{ikx}\,dk$$

$$c(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\xi)e^{-ik\xi}\,d\xi \qquad (8\text{-}13')$$


These equations may be written more symmetrically by putting $c(k) = (1/\sqrt{2\pi})\,g(k)$. They then become

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(k)e^{ikx}\,dk$$

$$g(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)e^{-ik\xi}\,d\xi \qquad (8\text{-}13)$$

Two functions $f$ and $g$ related by eq. (13) are called a pair of Fourier transforms;
i.e., $g$ is the Fourier transform of $f$ and vice versa. Such pairs are of
great importance in the analysis of electrical impulses 5 and in quantum
mechanics, where they effect the transformation from coordinate to momentum
space.
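A transform pair in the sense of eq. (13) can be checked by direct numerical integration. The sketch below is an illustration under stated assumptions: the routine name `fourier_transform`, the finite cut-off of the integration range and the sample function $f(x) = e^{-|x|}$ are choices made here, not taken from the text. For this $f$ the integral in (13) is elementary and gives $g(k) = \sqrt{2/\pi}\,/(1 + k^2)$.

```python
import cmath
import math

def fourier_transform(f, k, half_width=40.0, steps=40000):
    """Midpoint-rule estimate of g(k) = (2 pi)^(-1/2) * integral f(xi) e^{-i k xi} d xi.

    The infinite range is cut off at +/- half_width, which is adequate only
    for functions that are negligible beyond that distance.
    """
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps):
        xi = -half_width + (j + 0.5) * h
        total += f(xi) * cmath.exp(-1j * k * xi)
    return total * h / math.sqrt(2.0 * math.pi)

f = lambda x: math.exp(-abs(x))                       # sample function
g_exact = lambda k: math.sqrt(2.0 / math.pi) / (1.0 + k * k)

for k in (0.0, 0.5, 1.5):
    print(k, fourier_transform(f, k).real, g_exact(k))
```

The imaginary part of the numerical result is negligible because the sample function is even.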


Problems.

a. Show that the Fourier transform of $f(x) = e^{-x^2/2}$ is $g(k) = e^{-k^2/2}$. (This fact is
occasionally expressed by saying: the error function $e^{-x^2/2}$ is its own Fourier transform.)

b. Show that the F.T. of the step function $f(x) = \sqrt{\pi/2}\,(1/l)$ if $|x| < l$ and vanishing
if $|x| > l$ is $g(k) = \sin kl/kl$. Note: as $l$ approaches zero, $f(x)$ becomes $\infty$ at
$x = 0$. It is then called a “unit impulse” function, or a δ-function. Its transform
$g(k) = 1$.

4 E.g., Titchmarsh, E. C., “Introduction to the Theory of Fourier Integrals,” Oxford
University Press, 1937.

5 For further considerations see v. Kármán, T., and Biot, M., “Mathematical
Methods in Engineering,” McGraw-Hill Book Co., 1940. An extensive list of Fourier
transforms has been compiled by Campbell, G. A., and Foster, R. M., “Fourier Integrals




C Show that the F.T. of f(x) =* J *V 2 


“ cos fc 0 x if I X I < Z sin [(/c 0 - fc)Z] 

, . k 0 — k 

0 if he >2 


The Fourier Integral Theorem may be deduced immediately from (13).
On putting $g(k)$ into the integral for $f(x)$, there results

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(\xi)e^{ik(x-\xi)}\,dk\,d\xi \qquad (8\text{-}14)$$

When $f(x)$ is real, the imaginary part of $e^{ik(x-\xi)}$ may clearly be neglected,
and the Fourier integral theorem takes the more customary form:

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}dk\int_{-\infty}^{\infty} f(\xi)\cos k(x-\xi)\,d\xi \qquad (8\text{-}14')$$

Finally one may derive from (14) a result sometimes called the Dirichlet
integral. On performing the integration over $k$ not between infinite limits but
between $-A$ and $A$, and then passing to the limit $A \to \infty$ we find

$$f(x) = \frac{1}{\pi}\lim_{A\to\infty}\int_{-\infty}^{\infty} f(\xi)\,\frac{\sin[A(x-\xi)]}{x-\xi}\,d\xi \qquad (8\text{-}15)$$

As a special form of (15) we note:

$$f(0) = \frac{1}{\pi}\lim_{A\to\infty}\int_{-\infty}^{\infty} f(x)\,\frac{\sin Ax}{x}\,dx$$

The expression $\frac{1}{\pi}\lim_{A\to\infty}\frac{\sin[A(x-\xi)]}{x-\xi}$, or $\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{i(x-\xi)t}\,dt$, is called the
Dirac δ-function and denoted by $\delta(x,\xi)$. Eq. (15) may therefore be written

$$f(x) = \int_{-\infty}^{\infty} f(\xi)\,\delta(x,\xi)\,d\xi \qquad (8\text{-}15')$$


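All the foregoing results can be generalized 6 to permit expansion of functions
of several variables, provided they satisfy the condition that

$$\int |f(x,y,z\cdots)|\,dx\,dy\,dz\cdots$$

exists. For instance, in place of (12) we have

$$f(x,y) = \sum_{m,n} c_{m,n}\,e^{i(\pi/l)(mx+ny)}, \qquad c_{m,n} = \frac{1}{4l^2}\int_{-l}^{l}\int_{-l}^{l} f(\xi,\eta)\,e^{-i(\pi/l)(m\xi+n\eta)}\,d\xi\,d\eta \qquad (8\text{-}16)$$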




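and in place of (13),

$$f(x,y) = \frac{1}{2\pi}\int\!\!\int_{-\infty}^{\infty} g(k_1,k_2)\,e^{i(k_1x+k_2y)}\,dk_1\,dk_2$$

$$g(k_1,k_2) = \frac{1}{2\pi}\int\!\!\int_{-\infty}^{\infty} f(\xi,\eta)\,e^{-i(k_1\xi+k_2\eta)}\,d\xi\,d\eta \qquad (8\text{-}17)$$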


8.3. Vibrating Circular Membrane; Fourier-Bessel Transforms. — The 
mathematical description of the vibrating membrane also leads to an interesting
eigenvalue problem. The wave equation, when written in polar
coordinates, was shown in sec. 7.10 (cf. eq. 7-35) to have the solution

$$U = S\cdot T, \qquad S_{k,m} = Z_m(k\rho)e^{\pm im\varphi} \qquad (8\text{-}18)$$

The fact, pointed out before, that $m$ must here be an integer to insure the
function to be physically meaningful ($e^{\pm im\varphi}$ must be the same as $e^{\pm im(\varphi+2\pi)}$
because $\varphi$ and $\varphi + 2\pi$ denote the same angle in the problem of the membrane)
may also be expressed by saying: the eigenvalues of $m$ in the differential
equation $\Phi'' = -m^2\Phi$ are all integers. Note that the corresponding
eigenfunctions, $e^{im\varphi}$, are orthogonal and form a complete set, the
range being $(0,2\pi)$. But we wish here to discuss another, less simple
eigenvalue problem.

Consider modes of vibration of the membrane which have circular
symmetry. This limits $m$ to the value zero, and (18) becomes

$$S_k = Z_0(k\rho) \qquad (8\text{-}19)$$

We now impose the boundary condition: $U = 0$ at all times at the periphery
of the membrane, corresponding to the physical condition of having the
edge fixed. If the radius of the membrane is $a$, this means

$$Z_0(ka) = 0 \qquad (8\text{-}20)$$

The function $Z_0$ is a linear combination of the Bessel functions $J_0$ and $N_0$,
a Bessel function of the second kind which is linearly independent of $J_0$
(sometimes called a Neumann function). But the latter may be shown to
be infinite at $\rho = 0$ and must therefore be excluded. The $Z_0$ in (19) and
(20) must therefore be interpreted as $J_0$. To satisfy (20) the parameter $k$
must be so adjusted as to make $ka$ a root of $J_0$, and since $J_0$ has an infinite
number of roots, 7 the eigenvalues of $k$ will form an infinite set $k_i = x_i/a$,
where $x_i$ is the $i$-th root of $J_0(x)$. The corresponding eigenfunctions
are $J_0(k_i\rho)$.


Are these functions orthogonal? It is not difficult to show that
$\int_0^a J_0(k_1\rho)J_0(k_2\rho)\,d\rho$ is different from zero (an inspection of the graph of
the integrand will convince the reader). Thus it seems that eq. (5) fails
in this example. But we have overlooked an important feature: the element
of area of the circular membrane is not $d\rho$, but $2\pi\rho\,d\rho$. And now it
will be found that

$$\int_0^a J_0(k_m\rho)J_0(k_n\rho)\,\rho\,d\rho = c_n\delta_{m,n} \qquad (8\text{-}21)$$

As the present problem shows, specification of a range of integration is
not sufficient in defining orthogonality of functions; it is also necessary to
state the weighting factor associated with each differential range of the
coordinate. In the problem of the vibrating string, the weighting factor $w(x)$
happened to be unity; here it is $w(\rho) = \rho$. In the next example it will be
seen to be $\rho^2$. The same $w$ which appears in the orthogonality relation will
also occur in the integrals defining expansion coefficients (cf. eq. 42).

To prove eq. (21) for $m \neq n$ we use the last of the formulas in sec. 3.9,
according to which the left-hand side has the value 0 because both $J_0(k_1a)$
and $J_0(k_2a)$ vanish. According to another formula in this list,

$$\int_0^a [J_0(k_n\rho)]^2\rho\,d\rho = -\frac{a^2}{2}J_{-1}(k_na)J_1(k_na)$$

but in view of eq. (3-69), $J_{-1} = -J_1$, so that the constant $c_n$ in (21) has
the value $(a^2/2)[J_1(k_na)]^2$.
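The role of the weighting factor in eq. (21) can be made concrete by a small computation. The sketch below is illustrative only: it evaluates $J_n$ from its power series, which is adequate for the moderate arguments used here, locates the first two roots of $J_0$ by bisection, and compares the weighted and unweighted integrals. All function names, the bracketing intervals and the choice $a = 1$ are assumptions made for this example.

```python
import math

def bessel_j(n, x, terms=40):
    """Power series for J_n(x); adequate for moderate |x| (here x < 10)."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m / (math.factorial(m) * math.factorial(m + n))
                  * (x / 2.0) ** (2 * m + n))
    return total

def bisect(func, lo, hi, iters=60):
    """Root of func in (lo, hi), assuming a single sign change there."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def inner(k1, k2, a, weight, steps=1000):
    """Midpoint-rule estimate of integral_0^a J0(k1 rho) J0(k2 rho) weight(rho) d rho."""
    h = a / steps
    total = 0.0
    for j in range(steps):
        rho = (j + 0.5) * h
        total += bessel_j(0, k1 * rho) * bessel_j(0, k2 * rho) * weight(rho)
    return total * h

a = 1.0
x1 = bisect(lambda x: bessel_j(0, x), 2.0, 3.0)   # first root of J0
x2 = bisect(lambda x: bessel_j(0, x), 5.0, 6.0)   # second root of J0
k1, k2 = x1 / a, x2 / a

print(inner(k1, k2, a, lambda rho: 1.0))   # unweighted: not zero
print(inner(k1, k2, a, lambda rho: rho))   # weighted by rho: practically zero
print(inner(k1, k1, a, lambda rho: rho), 0.5 * a * a * bessel_j(1, x1) ** 2)
```

The last line compares the numerical normalization integral with the value $(a^2/2)[J_1(k_na)]^2$ quoted for $c_n$.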

The question of the completeness of the functions $J_0(k_n\rho)$, i.e., the possibility
of the expansion

$$f(\rho) = \sum_{n=1}^{\infty} a_nJ_0(k_n\rho) \qquad (8\text{-}22)$$

will be investigated in sec. 8. We shall here anticipate completeness provided,
of course, that $f(\rho)$ vanishes also at $\rho = a$. Granting this, the coefficients
$a_n$ may be computed in the manner already illustrated in connection
with Fourier series:

Multiply both sides of (22) by $J_0(k_m\rho)\rho\,d\rho$ and integrate. The result
is, again in view of (21),

$$\int_0^a f(\rho)J_0(k_m\rho)\,\rho\,d\rho = a_m\cdot\frac{a^2}{2}[J_1(k_ma)]^2$$

If we use the normalized function $S_n = (\sqrt{2}/a)[J_1(k_na)]^{-1}J_0(k_n\rho)$, the
expansion reads

$$f(\rho) = \sum_n a_nS_n(\rho)$$




and the coefficients are

$$a_n = \int_0^a f(\rho)S_n(\rho)\,\rho\,d\rho$$


The problem of the circular membrane has been simplified by our
assumption of circular symmetry. One may wonder what happens if
types of vibrations are permitted in which the displacement is a function of
both $\rho$ and $\varphi$, for these certainly occur. It is then necessary to use the
function $S_{k,m}$ defined in eq. (18). These may easily be seen to be orthogonal
with respect to both indices, i.e.,

$$\int_0^a\!\!\int_0^{2\pi} J_m(k_n\rho)e^{-im\varphi}\,J_{m'}(k_{n'}\rho)e^{im'\varphi}\,\rho\,d\rho\,d\varphi = c_{n,m}\,\delta_{nn'}\delta_{mm'}$$

Moreover it is possible to expand

$$f(\rho,\varphi) = \sum_{n,m} a_{nm}J_m(k_n\rho)e^{im\varphi}$$

The details of this development may be left as an exercise to the interested
reader; they are worked out fully in some works on sound. 8

The condition $f(a) = 0$, upon which the expansion (22) was based,
may be removed; the range of integration must then be extended from 0 to
$\infty$. Now it is clear that, as $a \to \infty$, the values $k_n$ move closer and closer
together. In the limit they will, in fact, form a continuum. When the
passage to this limit is performed, eq. (22) becomes what is known as a
Fourier-Bessel integral, 9 an equation which is useful in the theory of radiation.
While the transition to the limit is difficult, the result may be
obtained quite simply by a method used by Stratton, 10 which will here be
given.

Suppose $f(x,y)$ can be expanded according to eq. (17). In these
equations, we transform the variables of integrations to polar form:

$x = \rho\cos\varphi$, $y = \rho\sin\varphi$; $k_1 = k\cos\alpha$, $k_2 = k\sin\alpha$

They then read

$$f(\rho,\varphi) = \frac{1}{2\pi}\int_0^{\infty} k\,dk\int_0^{2\pi} g(k,\alpha)\,e^{ik\rho\cos(\varphi-\alpha)}\,d\alpha \qquad (8\text{-}23a)$$

$$g(k,\alpha) = \frac{1}{2\pi}\int_0^{\infty} \rho\,d\rho\int_0^{2\pi} f(\rho,\varphi)\,e^{-ik\rho\cos(\varphi-\alpha)}\,d\varphi \qquad (8\text{-}23b)$$

8 See particularly Morse, P. M., “ Vibration and Sound,” McGraw-Hill Book Co., 
1936, p. 153 et seq. 

9 We are here following a terminology which seems to be gaining ground, although we 




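We now take for $f(\rho,\varphi)$ the special function $f(\rho)e^{im\varphi}$. The integration over
$\varphi$ appearing in (23b) may then be performed with the use of eq. (3-72a),
according to which

$$\int_0^{2\pi} e^{i[m\varphi - k\rho\cos(\varphi-\alpha)]}\,d\varphi = \int_0^{2\pi} e^{i[m\varphi - k\rho\sin(\varphi-\alpha+\pi/2)]}\,d\varphi$$

$$= e^{im(\alpha-\pi/2)}\int_0^{2\pi} e^{i[m\theta - k\rho\sin\theta]}\,d\theta = e^{im(\alpha-\pi/2)}\int_0^{2\pi}\cos[m\theta - k\rho\sin\theta]\,d\theta$$

$$= 2\pi e^{im(\alpha-\pi/2)}\cdot J_m(k\rho)$$

Thus

$$g(k,\alpha) = \int_0^{\infty} f(\rho)J_m(k\rho)\,\rho\,d\rho\cdot e^{im(\alpha-\pi/2)} \equiv g(k)\cdot e^{im(\alpha-\pi/2)}$$

On putting this answer into (23a) we find

$$f(\rho)e^{im\varphi} = \frac{1}{2\pi}\int_0^{\infty} g(k)\,k\,dk\int_0^{2\pi} e^{i[m(\alpha-\pi/2)+k\rho\cos(\varphi-\alpha)]}\,d\alpha$$

$$= \frac{1}{2\pi}\int_0^{\infty} g(k)\,k\,dk\cdot 2\pi e^{im\varphi}J_m(k\rho)$$

These results may be expressed in the symmetrical form

$$f(\rho) = \int_0^{\infty} g(k)J_m(k\rho)\,k\,dk \qquad (8\text{-}24a)$$

$$g(k) = \int_0^{\infty} f(\rho)J_m(k\rho)\,\rho\,d\rho \qquad (8\text{-}24b)$$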

The functions $f$ and $g$ satisfying relations (24a, b) are said to be a pair of
Fourier-Bessel transforms. It is to be noted that the expansion (24a) of
the function $f(\rho)$ holds for every value of the integer $m$. Eq. (22), therefore,
is a special case of a Fourier-Bessel expansion.


Problem a. Show, using the formulas of sec. 3.9, that the Fourier-Bessel transform of
$f(\rho) = \rho^r$, with respect to $J_n$, is

$$g(k) = 2^{r+1}\,\frac{\Gamma\!\left(\frac{n+r}{2}+1\right)}{\Gamma\!\left(\frac{n-r}{2}\right)}\,k^{-r-2}$$


Problem b. Verify the identity 





8.4. Vibrating Sphere with Fixed Surface. — The problem of a sphere 
vibrating with a node at its surface is of little interest in acoustics, for if 
there is never any displacement at the surface, the sphere cannot radiate. 
However, the same problem, interpreted quantum-mechanically, describes 
the motion of a particle within a spherical cavity and has as such enjoyed 
some attention in nuclear physics. For the sake of simplicity, we shall 
here maintain the acoustic interpretation. 

The solution, $S$, of the space part of the wave equation was shown in
eq. (7-43) et seq. to be of the form

$$S_k = \sum_{l=0}^{\infty} c_{kl}\,Y_l(\theta,\varphi)\,r^{-1/2}Z_{l+1/2}(kr) \qquad (8\text{-}25)$$

As usual, $k$ determines the frequency of the vibration: $\nu = kv/2\pi$, $v$ being
the velocity of the waves inside the spherical medium. Eigenvalues in $k$,
and hence in the frequency “spectrum,” are induced by the boundary
condition

$S_k = 0$ at $r = a$, the radius of the sphere

According to (25), this is satisfied only if $Z_{l+1/2}(ka) = 0$. Thus, for every
integer $l$, there exists an infinite sequence of $k_i$ such that $k_ia$ is a root of
$Z_{l+1/2}$. But $Z_{l+1/2}$ is a linear combination of $J_{l+1/2}$ and $J_{-l-1/2}$, of which
only the former can be retained because $r^{-1/2}J_{-l-1/2}$ is always infinite at
$r = 0$ and does not, therefore, represent a possible mode of vibration.
Hence it is

$$J_{l+1/2}(ka) = 0$$

which determines the eigenvalues of $k$.

When $l = 0$ the situation is very simple indeed, for $J_{1/2}(x) = \sqrt{2/\pi x}\,\sin x$.
Thus the $k$’s are determined by $\sin(ka) = 0$, which means that for this case

$$k_{0,n} = \frac{n\pi}{a}, \qquad n \text{ an integer}$$

The frequency spectrum is much the same as for the vibrating string. Let
us now see what is the physical meaning of the condition $l = 0$. A glance
at eq. (25) shows that $Y_l(\theta,\varphi)$ is a constant, and this means there are no
radial nodes. The sphere vibrates in spherical symmetry.

In addition to these eigenvalues, which have a linear distribution, there
are the other sets given by $J_{l+1/2}(k_la) = 0$. These are irregularly distributed
and interspersed between the $k_{0n}$.
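The interspersing of the two sets of eigenvalues can be seen in the simplest case beyond $l = 0$. The sketch below relies on the standard closed form $J_{3/2}(x) = \sqrt{2/\pi x}\,(\sin x/x - \cos x)$, which is an outside fact not derived in this section, so that for $l = 1$ the roots satisfy $\sin x - x\cos x = 0$. The function names, the bracketing intervals and the value $a = 1$ are assumptions made for the illustration.

```python
import math

def g(x):
    """sin x - x cos x, which vanishes wherever J_{3/2}(x) does (x > 0)."""
    return math.sin(x) - x * math.cos(x)

def root_in(lo, hi, iters=80):
    """Bisection for the root of g in (lo, hi), assuming one sign change."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def l0_eigenvalues(a, count):
    """k_{0,n} = n pi / a, the evenly spaced set for l = 0."""
    return [n * math.pi / a for n in range(1, count + 1)]

def l1_eigenvalues(a, count):
    """Roots of J_{3/2}(k a) = 0; the n-th root lies between n pi and (n + 1/2) pi."""
    return [root_in(n * math.pi, (n + 0.5) * math.pi) / a
            for n in range(1, count + 1)]

a = 1.0
print(l0_eigenvalues(a, 4))
print(l1_eigenvalues(a, 4))
```

Each $l = 1$ value falls between two successive $l = 0$ values, and the spacing between neighbouring $l = 1$ values is not constant.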

The orthogonality of the S k (eq. 25) is at once evident. Orthogonality 




volume element contains this factor. Thus

$$\int S_{k_1}S_{k_2}\,d\tau \propto \int_0^a J_{l+1/2}(k_1r)J_{l+1/2}(k_2r)\,r\,dr$$

an expression which vanishes unless $k_1 = k_2$ as is seen from the last formula
in sec. 3.9.

By more special considerations it may also be shown that the set of
functions is complete in the sense that any $f(r,\theta,\varphi)$ which vanishes
at $r = a$ and is piecewise continuous can be expanded in the form
$\sum_{l,n} c_{l,n}\,Y_l(\theta,\varphi)\,r^{-1/2}J_{l+1/2}(k_{l,n}r)$. For the special case $l = 0$ this expansion
reduces to a Fourier series.

Problem. Compute the lowest 12 eigenfrequencies of the vibrating sphere.

8.5. Laplace and Related Transformations. — A Laplace transformation
of the function $F(t)$ is

$$\int_0^{\infty} F(t)e^{-st}\,dt = f(s) \qquad (8\text{-}26)$$

The function $f(s)$ is the Laplace transform of $F(t)$. Symbolically, we write

$$\mathcal{L}[F(t)] = f(s) \qquad (8\text{-}27)$$

If, in eq. (26), we put $t = -\ln z$ and write $\Phi(z)$ for $F(-\ln z)$, we get

$$f(s) = -\int_1^0 \Phi(z)z^{s-1}\,dz = \int_0^1 \Phi(z)z^{s-1}\,dz$$

This is called a Mellin transformation.
If $s = -ix$, eq. (26) reads

$$f(-ix) = \int_0^{\infty} F(t)e^{ixt}\,dt$$

It represents a Fourier transformation of the function $f(-ix)$. In eq.
(13′) we have encountered a formula very similar to this, except that
there the transformation was “two-sided,” or bilateral, i.e., the integral
extended from $-\infty$ to $+\infty$, and the function was called $f(x)$ instead
of $f(-ix)$. In this section we limit our study to one-sided Laplace transformations,
which are the ones usually encountered in practice.

We shall now derive a formula expressing
$F(t)$ in terms of $f(s)$, i.e., a formula which
represents the inversion of eq. (26).

By Cauchy’s integral theorem, eq. (3-1),

$$f(s) = \frac{1}{2\pi i}\oint_C \frac{f(z)\,dz}{z - s}$$






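Suppose $f(z)$ is analytic to the right of the line $x = \gamma$ (see Fig. 1). We
can then distort the contour $C$ and integrate from $\gamma + i\infty$ to $\gamma - i\infty$,
thence to the right to $\infty - i\infty$, up to $\infty + i\infty$ and back to $\gamma + i\infty$.
Only the part from $\gamma + i\infty$ to $\gamma - i\infty$ contributes to the integral. Hence,

$$f(s) = \frac{1}{2\pi i}\int_{\gamma+i\infty}^{\gamma-i\infty}\frac{f(z)\,dz}{z-s} = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{f(z)\,dz}{s-z}$$

Clearly, $\gamma$ must be smaller than the real part of $s$, in symbols: $\gamma < R(s)$.

To the last equation we apply the inverse operator $\mathcal{L}^{-1}$, understanding that
it acts upon the variable $s$:

$$\mathcal{L}^{-1}f(s) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} f(z)\,dz\,\mathcal{L}^{-1}\!\left(\frac{1}{s-z}\right) \qquad (8\text{-}28)$$

But we now show that $\mathcal{L}^{-1}\!\left(\frac{1}{s-z}\right) = e^{zt}$. We have

$$\mathcal{L}(e^{zt}) = \int_0^{\infty} e^{-(s-z)t}\,dt = \frac{1}{s-z} \qquad \text{if } R(s) > R(z)$$

This condition is satisfied in eq. (28). Hence,

$$F(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} f(z)e^{zt}\,dz = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} f(s)e^{st}\,ds \qquad (8\text{-}29)$$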

ZlTltSy—ioo 


Here $\gamma$ is any real number such that, to the right of $R(z) = \gamma$ (a vertical
line through $\gamma$), the function $f(z)$ is analytic. Equations (26) and (29)
represent a Laplace transformation of $F(t)$ and its inverse. $F(t)$ and $f(s)$
are said to be a pair of Laplace transforms.
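The elementary transform used in the derivation, $\mathcal{L}(e^{zt}) = 1/(s - z)$ for $R(s) > R(z)$, lends itself to a direct numerical check of definition (26). The following sketch is illustrative: the routine name `laplace`, the finite upper limit standing in for infinity and the sample values of $s$ and $z$ are assumptions of this example.

```python
import cmath

def laplace(F, s, t_max=40.0, steps=40000):
    """Midpoint-rule estimate of integral_0^t_max F(t) e^{-s t} dt.

    The upper limit t_max replaces infinity; this is adequate only when the
    integrand has decayed to a negligible size by t_max.
    """
    h = t_max / steps
    total = 0j
    for j in range(steps):
        t = (j + 0.5) * h
        total += F(t) * cmath.exp(-s * t)
    return total * h

z = 0.5 + 3.0j      # sample complex exponent
s = 2.0             # real part of s exceeds real part of z
numeric = laplace(lambda t: cmath.exp(z * t), s)
print(numeric, 1.0 / (s - z))
```

If $s$ is chosen with $R(s) < R(z)$ the integrand grows instead of decaying, and the numerical value merely reflects the cut-off, in keeping with the stated condition.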

When the function $F(t)$ is changed to some other function, $f(s)$ will
likewise undergo a change. It is useful to study such correlated changes.

Suppose $f(s) = \mathcal{L}[F(t)]$

Now “operate” on the function $F(t)$ with some operator $P$, converting it
into $PF(t)$. The transform of this function may be called $pf(s)$, so that

$pf(s) = \mathcal{L}[PF(t)]$

We want to know what operator $p$ corresponds to $P$.

(1) Let $P$ be a linear substitution:

$PF(t) = F(at - b)$, $a > 0$, $b > 0$

If $F$ vanishes for negative arguments,

$$\mathcal{L}[F(at-b)] = \int_0^{\infty} F(at-b)e^{-st}\,dt = \frac{e^{-bs/a}}{a}\int_0^{\infty} F(\tau)e^{-s\tau/a}\,d\tau,$$

$$\mathcal{L}[F(at-b)] = \frac{e^{-bs/a}}{a}\,f\!\left(\frac{s}{a}\right) \qquad (8\text{-}30)$$

The operator $p$ which corresponds to our linear substitution is: multiplication
by $e^{-bs/a}/a$ and substitution of $s/a$ for $s$.

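(2) Integration. Let $P = \int_0^t d\tau$.

We wish to find $\int_0^{\infty} dt\,e^{-st}\int_0^t F(\tau)\,d\tau = \int_0^{\infty} dt\,e^{-st}\Phi(t)$ where $\frac{d\Phi}{dt} = F(t)$.
Integrate by parts, obtaining

$$-\frac{1}{s}\Phi e^{-st}\Big|_0^{\infty} + \frac{1}{s}\int_0^{\infty} F(t)e^{-st}\,dt$$

The integrated part vanishes; hence

$$\mathcal{L}\left[\int_0^t F(\tau)\,d\tau\right] = \frac{1}{s}\,\mathcal{L}[F]$$

To integration, there corresponds division by $s$. Also, for iterated
integration, one finds by repeated application of this formula

$$\mathcal{L}\left[\left(\int_0^t d\tau\right)^n F(\tau)\right] = s^{-n}\mathcal{L}[F] \qquad (8\text{-}31)$$

(3) Differentiation.

$$\mathcal{L}\left[\frac{dF}{dt}\right] = \int_0^{\infty}\frac{dF}{dt}e^{-st}\,dt = Fe^{-st}\Big|_0^{\infty} + s\int_0^{\infty} Fe^{-st}\,dt = s\mathcal{L}[F] - F(0) = sf(s) - F(0) \qquad (8\text{-}32)$$

provided $F(0)$ is the value of $F$ at $t = 0$.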
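Rules (30), (31) with $n = 1$, and (32) can be checked numerically on a single test function. In the sketch below the test function $F(t) = \cos t$ (taken as zero for negative arguments), its known transform $s/(s^2 + 1)$, the helper names and the sample values of $s$, $a$, $b$ are assumptions of this example rather than material from the text.

```python
import math

def laplace(F, s, t_max=40.0, steps=40000):
    """Midpoint-rule estimate of integral_0^t_max F(t) e^{-s t} dt."""
    h = t_max / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += F(t) * math.exp(-s * t)
    return total * h

def F(t):
    """Test function cos t, defined to vanish for negative arguments."""
    return math.cos(t) if t >= 0.0 else 0.0

def f(s):
    """Known Laplace transform of cos t."""
    return s / (s * s + 1.0)

s, a, b = 1.5, 2.0, 1.0
shifted = laplace(lambda t: F(a * t - b), s)            # left side of rule (30)
integrated = laplace(lambda t: math.sin(t), s)          # transform of integral_0^t cos
differentiated = laplace(lambda t: -math.sin(t), s)     # transform of d(cos t)/dt

print(shifted, math.exp(-b * s / a) / a * f(s / a))     # rule (30)
print(integrated, f(s) / s)                             # rule (31), n = 1
print(differentiated, s * f(s) - F(0.0))                # rule (32)
```

Note that the term $F(0)$ in rule (32) is essential here: without it the two numbers in the last line would differ by unity.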

(4) Convolution. If $F_1$ and $F_2$ are both functions of $t$, the integral

$$\int_0^t F_1(\tau)F_2(t-\tau)\,d\tau \qquad (8\text{-}33)$$

is often denoted by $F_1 * F_2$ and called the convolution or Faltung of $F_1$ and
$F_2$. It is of frequent occurrence in physical problems. Suppose, for
instance, that an error, $\epsilon$, is the linear result of two individual errors,
$\epsilon = \epsilon_1 + \epsilon_2$, and that we know the distributions, or probabilities of $\epsilon_1$ and
$\epsilon_2$. These are $w_1(\epsilon_1)$ and $w_2(\epsilon_2)$. The distribution of $\epsilon$ is then clearly

$$w(\epsilon) = \iint_{\epsilon_1+\epsilon_2=\epsilon} w_1(\epsilon_1)w_2(\epsilon_2)\,d\epsilon_1\,d\epsilon_2 = \int w_1(\epsilon_1)w_2(\epsilon-\epsilon_1)\,d\epsilon_1 = w_1 * w_2$$

The German word “Faltung” means folding; it arises from the following
simple fact. If a line of length $t$ be folded back in the middle, as in
Fig. 8-2, the points adjacent to each other on the two segments are those
which lie, respectively, at distances $\tau$ and $t - \tau$ from the origin $O$. These,
however, are the arguments of the functions $F_1$ and $F_2$ that occur in the
convolution integral.

Fig. 8-2

One final comment regarding this integral: If $F_2$ is
defined to be zero for negative arguments, the upper limit, instead of being
$t$, can be taken to be $\infty$.

The Laplace transform of a convolution is very easy to compute.

$$\mathcal{L}[F_1 * F_2] = \int_0^{\infty} dt\,e^{-st}\int_0^t F_1(\tau)F_2(t-\tau)\,d\tau$$

$$= \int_0^{\infty} d\tau'\int_0^{\infty} d\tau\,e^{-s(\tau+\tau')}F_1(\tau)F_2(\tau')$$

$$= \int_0^{\infty} F_2(\tau')e^{-s\tau'}\,d\tau'\int_0^{\infty} F_1(\tau)e^{-s\tau}\,d\tau$$

if $F_2(t) = 0$ for $t < 0$.

Hence,

$$\mathcal{L}[F_1 * F_2] = f_1f_2 \qquad (8\text{-}34)$$

The transform of the convolution is the product of the transforms of $F_1$
and $F_2$. Note also that convolution is commutative,

$F_1 * F_2 = F_2 * F_1$

and associative,

$(F_1 * F_2) * F_3 = F_1 * (F_2 * F_3)$
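Theorem (34) may be illustrated with two decaying exponentials. The sketch below is a rough numerical check and not part of the original argument: the functions $F_1 = e^{-t}$ and $F_2 = e^{-2t}$, the helper names and the quadrature settings are assumptions chosen for the example. For this pair the convolution can be done by hand and equals $e^{-t} - e^{-2t}$.

```python
import math

def convolve(F1, F2, t, steps=200):
    """Midpoint-rule estimate of integral_0^t F1(tau) F2(t - tau) d tau."""
    h = t / steps
    total = 0.0
    for j in range(steps):
        tau = (j + 0.5) * h
        total += F1(tau) * F2(t - tau)
    return total * h

def laplace(F, s, t_max=20.0, steps=2000):
    """Midpoint-rule estimate of integral_0^t_max F(t) e^{-s t} dt."""
    h = t_max / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += F(t) * math.exp(-s * t)
    return total * h

F1 = lambda t: math.exp(-t)
F2 = lambda t: math.exp(-2.0 * t)
s = 1.0

lhs = laplace(lambda t: convolve(F1, F2, t), s)   # transform of the convolution
rhs = laplace(F1, s) * laplace(F2, s)             # product of the transforms
print(lhs, rhs, 1.0 / ((s + 1.0) * (s + 2.0)))
print(convolve(F1, F2, 1.0), convolve(F2, F1, 1.0))   # commutativity at t = 1
```

The three numbers in the first line agree to the accuracy of the quadrature, and the second line shows the commutative property at one sample point.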

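(5) Multiplication. To find

$$\mathcal{L}[F_1(t)\cdot F_2(t)]$$

we consider the triple integral

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dz\int_0^{\infty} e^{-(s-iz)t_1}F_1(t_1)\,dt_1\int_0^{\infty} e^{-izt_2}F_2(t_2)\,dt_2$$

which is, by definition, $\frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - iz)f_2(iz)$. On integrating over $z$,
there results $\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{iz(t_1-t_2)}\,dz$ = the δ-function defined in (8-15′).
Hence the integral becomes

$$\int_0^{\infty} e^{-st_1}F_1(t_1)\,dt_1\int_0^{\infty} F_2(t_2)\,\delta(t_1,t_2)\,dt_2 = \int_0^{\infty} e^{-st_1}F_1(t_1)F_2(t_1)\,dt_1$$

Therefore

$$\mathcal{L}(F_1F_2) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - iz)f_2(iz)$$

If the variable is changed from $iz$ to $z$,

$$\mathcal{L}[F_1F_2] = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty} dz\,f_1(s - z)f_2(z) \qquad (8\text{-}35)$$

The transform of a product is a convolution along the imaginary axis.

For Fourier transformations, we have

$$\mathcal{F}[F_1] = \int_{-\infty}^{\infty} e^{-ist}F_1(t)\,dt, \qquad \mathcal{F}[F_2] = \int_{-\infty}^{\infty} e^{-ist}F_2(t)\,dt$$

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - z)f_2(z) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dz\int_{-\infty}^{\infty} e^{-i(s-z)t_1}F_1(t_1)\,dt_1\int_{-\infty}^{\infty} e^{-izt_2}F_2(t_2)\,dt_2$$

The integration over $z$ yields $\delta(t_1,t_2)$; hence

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - z)f_2(z) = \int_{-\infty}^{\infty} e^{-ist}F_1(t)F_2(t)\,dt = \mathcal{F}[F_1F_2]$$

The Fourier transform of a product is $\frac{1}{2\pi}$ times the ordinary convolution:
If $\mathcal{F}(F_1) = f_1$, $\mathcal{F}(F_2) = f_2$, then

$$\mathcal{F}(F_1F_2) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - z)f_2(z) = \frac{1}{2\pi}\,f_1 * f_2 \qquad (8\text{-}36)$$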

8.6. Use of Transforms in Solving Differential Equations. — A. Consider 
the differential equation

$$Y'' + k^2Y = 0 \qquad (8\text{-}37)$$

where the primes denote differentiation with respect to $t$.

Now by eq. (32),

$$\mathcal{L}(Y'') = s\mathcal{L}(Y') - Y'(0)$$
$$\mathcal{L}(Y') = s\mathcal{L}(Y) - Y(0)$$

hence,

$$\mathcal{L}(Y'') = s^2\mathcal{L}(Y) - Y'(0) - sY(0)$$

If we write $y(s)$ for $\mathcal{L}(Y)$, eq. (37) becomes

$$s^2y - Y'(0) - sY(0) + k^2y = 0$$

Note how nicely the initial conditions on $Y$ and on $Y'$ introduce themselves
into the calculation!

On solving, we obtain

$$y = \frac{Y'(0) + sY(0)}{k^2 + s^2} = \frac{a_1}{s + ik} + \frac{a_2}{s - ik}$$

where

$$a_1 = \tfrac{1}{2}\left[Y(0) + \frac{i}{k}Y'(0)\right], \qquad a_2 = \tfrac{1}{2}\left[Y(0) - \frac{i}{k}Y'(0)\right]$$

By eq. (29)

$$Y(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} y(s)e^{st}\,ds$$

Now

$$\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{e^{st}}{s + ik}\,ds = \frac{e^{-ikt}}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{e^{s't}}{s'}\,ds' \qquad (\text{on putting } s' = s + ik)$$

This integral can be evaluated by the method of residues (see 
Since y must be positive in order that the singularity at s' = 0 b< 
we integrate along the square drawn in Fig. 3. The extension o 


Fig. 8-3. Square contour of integration in the $s'$-plane.


to close the contour is harmless, for the added parts contribute ] 


equals 1. Hence,

$$\frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{e^{st}}{s + ik}\, ds = e^{-ikt}$$

The other part of $Y$, coming from $\dfrac{a_2}{s - ik}$, yields $e^{ikt}$. Thus

$$Y = a_1 e^{-ikt} + a_2 e^{ikt} \tag{8-38}$$

The reader can easily verify that this is a solution, indeed the solution which 
satisfies the initial conditions. 

B. Consider the inhomogeneous equation

$$Y'' + k^2 Y = F(t) \tag{8-39}$$

This leads to

$$s^2 y - Y'(0) - sY(0) + k^2 y = f(s)$$

if $f$ is the transform of $F$.

Hence,

$$y = \frac{f(s) + Y'(0) + sY(0)}{k^2 + s^2}$$

We thus obtain, in addition to the solution of the homogeneous equation
(38), a solution $Y_p$ whose transform is $\dfrac{f(s)}{k^2 + s^2}$. This can often be found
in tables. Otherwise we proceed as follows: Let

$$f_1 = f(s), \qquad f_2 = \frac{1}{k^2 + s^2}$$

Then, by theorem (34), $Y_p$ is the convolution of $F_1(t)$ and $F_2(t)$. $F_1$ is the
inhomogeneity in the differential equation (39).

We have

$$f_2 = \frac{1}{k^2 + s^2} = \frac{i}{2k} \left[\frac{1}{s + ik} - \frac{1}{s - ik}\right]$$

Hence

$$F_2(t) = \frac{i}{2k} \cdot \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} e^{st} \left[\frac{1}{s + ik} - \frac{1}{s - ik}\right] ds = \frac{i}{2k} \left[e^{-ikt} - e^{ikt}\right] = \frac{\sin kt}{k}$$

$$Y_p = F_1 * F_2 = \int_0^t F_1(\tau)\, \frac{\sin k(t - \tau)}{k}\, d\tau$$
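As an illustration of the convolution formula for $Y_p$, the sketch below evaluates the integral by Simpson's rule for the assumed inhomogeneity $F(t) = \sin \omega t$ and compares it with the closed-form solution of $Y'' + k^2 Y = \sin \omega t$ that has $Y(0) = Y'(0) = 0$. The values of `k` and `omega` and the function names are hypothetical choices made only for this check.

```python
import math

def particular_solution(F, k, t, n=2000):
    """Y_p(t) = integral_0^t F(tau) sin(k (t - tau)) / k dtau by Simpson's rule (n even)."""
    if t == 0.0:
        return 0.0
    h = t / n
    g = lambda tau: F(tau) * math.sin(k * (t - tau)) / k
    total = g(0.0) + g(t)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(j * h)
    return total * h / 3.0

# Assumed example: F(t) = sin(omega t) with omega different from k.
k, omega = 2.0, 1.0
F = lambda t: math.sin(omega * t)

def exact(t):
    """Solution of Y'' + k^2 Y = sin(omega t) with Y(0) = Y'(0) = 0."""
    return (k * math.sin(omega * t) - omega * math.sin(k * t)) / (k * (k * k - omega * omega))

for t in (0.5, 1.0, 3.0):
    print(t, particular_solution(F, k, t), exact(t))
```

The convolution solution vanishes together with its first derivative at $t = 0$, so arbitrary initial conditions are accommodated by adding the homogeneous solution (38).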


Problem. Solve the equation, Y" +26F -ft try = /o sin at, by this method and 
compare the result with example a of sec. 2.8. 


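Table 1

$F(t)$                                        $f(s)$

$1$                                           $1/s$
$t^z$, $(\mathrm{Re}\, z > -1)$               $\Gamma(z + 1)/s^{z+1}$
$e^{at}$                                      $1/(s - a)$
$t^b e^{at}$                                  $\Gamma(b + 1)/(s - a)^{b+1}$
$\cos \omega t$                               $s/(s^2 + \omega^2)$
$\sin \omega t$                               $\omega/(s^2 + \omega^2)$
$\cosh \omega t$                              $s/(s^2 - \omega^2)$
$\sinh \omega t$                              $\omega/(s^2 - \omega^2)$
$\delta(t, \tau)$                             $e^{-s\tau}$
$0$ for $t < \tau$, $1$ for $t > \tau$        $e^{-s\tau}/s$
$(1 - e^{-t})^{\nu - 1}$                      $\Gamma(\nu)\Gamma(s)/\Gamma(s + \nu)$
$\cos(x\sqrt{t})/\pi\sqrt{t}$ ($x$ real)      $e^{-x^2/4s}/\sqrt{\pi s}$
$\sin(x\sqrt{t})/\pi$ ($x$ real)              $x e^{-x^2/4s}/(2\sqrt{\pi}\, s^{3/2})$
$J_0(t)$                                      $(1 + s^2)^{-1/2}$
$J_\nu(t)$; $\mathrm{Re}(\nu) > -1$           $(1 + s^2)^{-1/2}[(1 + s^2)^{1/2} - s]^\nu$
$L_n(t)$ (See sec. 3.11)                      $s^{-n-1}(s - 1)^n$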


More extensive tables of Laplace transforms may be found in G. Doetsch,
"Theorie und Anwendung der Laplace-Transformation," Dover Publishing Co., 1943.

See also: 

Ofl.rfilfi.W. TT S Qflfl .Toarrsar T C 1 . << J ~ f) /“'tl 




Jeffreys, H., "Operational Methods in Mathematical Physics," Cambridge
University Press, 1927.

Widder, D., "The Laplace Transform," Princeton University Press, 1941.

Magnus, W., and Oberhettinger, F., "Formulas and Theorems for the Special
Functions of Mathematical Physics," Chelsea Publishing Co., New York, 1949.
This contains tables of Fourier, Laplace, Hankel, Mellin, and Gauss transforms.

For Fourier transforms see:

Carslaw, H. S., and Jaeger, J. C., "Operational Methods in Applied Mathematics,"
Oxford Press, 1941.

Churchill, R. V., "Fourier Series and Boundary Value Problems," McGraw-Hill
Book Co., New York, 1941.

Sneddon, I. N., "Fourier Transforms," McGraw-Hill Book Co., New York, 1951.

Titchmarsh, E. C., "Introduction to the Theory of Fourier Integrals," Oxford
Press, 1937.

Wiener, N., "The Fourier Integral and Certain of Its Applications," Cambridge
University Press, 1933.

8.7. Sturm-Liouville Theory. Deeper insight into the nature of
eigenvalue problems which arise in connection with second order differential
equations is obtained from a study of a theory at once simple and
beautiful, the theory of the Sturm-Liouville equation. Nearly every
eigenvalue problem encountered in physics and chemistry leads to an
equation of the general form

$$L(u) + \lambda w u = 0 \tag{8-40}$$

where the differential operator $L$ is defined by

$$L(u) \equiv (pu')' - qu \tag{8-41}$$

The quantities $p$, $q$, and $w$ are understood to be functions of the independent
variable $x$, and we shall suppose that $w$, which will soon be recognized as
our former weighting function, satisfies

$$w(x) > 0$$

over the entire range of the variable $x$. This range is different in different
problems, but it will be assumed to be finite and to extend from $a$ to $b$.
Finally, $\lambda$ is a constant; it will turn out to be the eigenvalue parameter.
An operator $L$ of the form (41) is said to be self-adjoint. The necessary
and sufficient condition for the general second order differential operator

$$D(u) = fu'' + gu' + hu$$

(in which $f$, $g$, and $h$ are functions of $x$) to be self-adjoint is simply that
$g = f'$. Eq. (40), however, is not a very special one. Every second order
operator $D$ can be made self-adjoint if it is multiplied from the left by

$$\exp\left[\int \frac{g - f'}{f}\, dx\right]$$

Thus all differential equations encountered in Chapter 2 may be written in
self-adjoint form, and the theory we are presenting applies to them all. In
Table 2 we list the factor $F$ by which the equation named on the left, written
in the customary form in which it appears in Chapter 2, must be multiplied in
order to be self-adjoint, and also the quantities $p$, $q$, and $w$ in (40).

[Table 2. Equations of Sturm-Liouville type. Columns: type of equation; $F$;
$p(x)$; $q(x)$; $\lambda$; $w(x)$; number of eq. in Chapter 2.]

The function $u$ is subject to boundary conditions. In the examples of
the preceding sections these were of different types: in the problem of the
string every $u$ had to vanish at both end points, in the other problem it
was to be finite at $r = 0$ but zero at $r = a$. Examination of these and
many other examples (see Chapter 11) will show that the boundary condition
in most problems of interest may be expressed in the uniform way

$$puu'\,\big|_a = puu'\,\big|_b = 0$$

for usually either $p$ or $u$ or $u'$ vanishes at the end points of the range. But
it is equally satisfactory to state these conditions in a somewhat milder
form: Let $u$ and $v$ be any permissible solutions of eq. (40); we then require

$$vpu'\,\big|_a = vpu'\,\big|_b \tag{8-42}$$

On the basis of this condition it is possible to establish the important
theorem:

$$\int_a^b vL(u)\, dx = \int_a^b uL(v)\, dx \tag{8-43}$$

The proof is straightforward:

$$\int_a^b vL(u)\, dx = \int_a^b v(pu')'\, dx - \int_a^b vqu\, dx = vpu'\,\Big|_a^b - \int_a^b v'pu'\, dx - \int_a^b vqu\, dx$$

The first term on the right vanishes because of (42); the second may be
transformed by another partial integration into $-v'pu\,\big|_a^b + \int_a^b u(pv')'\, dx$,
of which the first vanishes also, by (42) with $u$ and $v$ interchanged. But the
remaining integral, $\int_a^b [u(pv')' - uqv]\, dx$, is nothing other than
$\int_a^b uL(v)\, dx$. The result (43) is often expressed by saying that the operator
$L$ is Hermitian with respect to functions satisfying condition (42). The
importance of Hermitian operators will be more evident in the next chapters.
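The Hermitian property (43) lends itself to a quick numerical check. The sketch below is illustrative only: the choices $p = 1 + x$, $q = x^2$ on the range $0 \le x \le 1$, the trial functions $u = \sin \pi x$ and $v = x(1 - x)$ (both vanishing at the end points, so that (42) holds), and the helper names are assumptions introduced for the example.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3.0

# Assumed coefficient functions of the operator L(u) = (p u')' - q u.
p = lambda x: 1.0 + x
dp = lambda x: 1.0          # derivative of p
q = lambda x: x * x

def L(f, df, d2f):
    """Return x -> L(f)(x), with (p f')' expanded as p' f' + p f''."""
    return lambda x: dp(x) * df(x) + p(x) * d2f(x) - q(x) * f(x)

# Trial functions that vanish at both end points, with their derivatives.
u = lambda x: math.sin(math.pi * x)
du = lambda x: math.pi * math.cos(math.pi * x)
d2u = lambda x: -math.pi ** 2 * math.sin(math.pi * x)

v = lambda x: x * (1.0 - x)
dv = lambda x: 1.0 - 2.0 * x
d2v = lambda x: -2.0

Lu, Lv = L(u, du, d2u), L(v, dv, d2v)
lhs = simpson(lambda x: v(x) * Lu(x), 0.0, 1.0)   # integral of v L(u)
rhs = simpson(lambda x: u(x) * Lv(x), 0.0, 1.0)   # integral of u L(v)
print(lhs, rhs)
```

If a trial function that does not satisfy (42) is substituted, the two integrals will in general differ by the boundary terms dropped in the proof.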




8.8. Variational Aspects of the Eigenvalue Problem.$^{12}$ Before
proceeding further, the reader is advised to review the main points of Chapter 6.
It will be shown that the Sturm-Liouville equation (40) is the Euler
condition which the function $u$ must satisfy in order that (1) the integral

$$\int_a^b (pu'^2 + qu^2)\, dx = \Lambda(u) \tag{8-44}$$

take on a stationary value, (2) the function $u$ be normalized:

$$\int_a^b wu^2\, dx = 1 \tag{8-45}$$

The proof is simple. In the notation of sec. 6.5 we have, on putting
$\lambda_1 = -\lambda$ for convenience,

$$K = I - \lambda I_1 = pu'^2 + qu^2 - \lambda wu^2$$

and the Euler equation (6-15) is

$$\frac{\partial K}{\partial u} - \frac{d}{dx} \frac{\partial K}{\partial u'} = 0 \tag{8-46}$$

This is clearly identical with (40). The eigenvalue $\lambda$ here plays the role of
a Lagrangian multiplier. We have thus seen that the process of solving
the Sturm-Liouville equation is tantamount to a search for those functions
$u(x)$ which maximize or minimize $\Lambda$, subject to condition (45). This
condition is important, for the integral $\Lambda$ has usually only a single stationary
value; but when eq. (45) is imposed $\Lambda$ has numerous values each of which
is stationary for a given neighborhood of functions $u(x)$, although
only one of them is an absolute minimum or maximum.

Example. Let us see whether the procedure here outlined will indeed
lead to a simple type of function defined by a Sturm-Liouville equation:
the Legendre polynomial. We start by assuming

$$u = a + bx + cx^2$$

with $a$, $b$, and $c$ unknown. From Table 2 we see that $p = 1 - x^2$, $q = 0$,
$w = 1$. Hence

$$\Lambda = \int_{-1}^{1} (1 - x^2)(b^2 + 4bcx + 4c^2x^2)\, dx = \tfrac{4}{3} b^2 + \tfrac{16}{15} c^2$$

We require that

$$\int_{-1}^{1} u^2\, dx = 2a^2 + \tfrac{2}{3}(b^2 + 2ac) + \tfrac{2}{5} c^2 = 1$$

$^{12}$ The development in this and the following sections leans heavily on
Courant-Hilbert, "Methoden der Mathematischen Physik," Vol. I, Second Edition.




Thus it is necessary to minimize

$$\tfrac{4}{3} b^2 + \tfrac{16}{15} c^2 - \lambda\left[2a^2 + \tfrac{2}{3}(b^2 + 2ac) + \tfrac{2}{5} c^2\right]$$

by choice of $a$, $b$, and $c$. On putting the partial derivatives with respect to
$a$, $b$, and $c$ equal to zero and finally rewriting the normalization condition,
four equations are obtained for the determination of the quantities $a$, $b$, $c$,
and $\lambda$:

$$\lambda\left(a + \frac{c}{3}\right) = 0 \tag{1}$$

$$b(2 - \lambda) = 0 \tag{2}$$

$$c(8 - 3\lambda) - 5a\lambda = 0 \tag{3}$$

$$a^2 + \frac{1}{3}(b^2 + 2ac) + \frac{c^2}{5} = \frac{1}{2} \tag{4}$$

Suppose we put $c = 0$. Then, according to (1) and (3), $a\lambda = 0$, while (4)
yields a relation between $a$ and $b$. Hence we can put either $a = 0$ or
$\lambda = 0$. In the latter instance, i.e., if

$$\lambda = 0$$

we get from (2), $b = 0$, and from (4), $a = \sqrt{1/2}$. In the former instance,
namely $a = 0$, (2) yields

$$\lambda = 2$$

and (4) gives $b = \sqrt{3/2}$.

Now instead of assuming $c = 0$, let us take $b = 0$. Consistency then
requires that neither $a$ nor $c$ nor $\lambda$ can be zero. Hence we find from (1),
$c = -3a$, and from (3) and (4),

$$\lambda = 6$$

and $a = \sqrt{5/8}$. We have thus determined altogether three solutions,
corresponding to three possible values of $\lambda$:

$\lambda$      $u$

$0$            $\sqrt{1/2}$
$2$            $\sqrt{3/2}\, x$
$6$            $\sqrt{5/8}\, (1 - 3x^2)$
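The same three stationary values can be obtained mechanically. Writing $u$ as a combination of the monomials $1$, $x$, $x^2$ turns the variational problem for Legendre's equation ($p = 1 - x^2$, $q = 0$, $w = 1$ on $-1 \le x \le 1$) into a small matrix eigenvalue problem $Ac = \lambda Bc$. The sketch below assumes NumPy is available; the matrix names `A` and `B` and the helper `integral` are illustrative, not notation from the text.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Trial basis 1, x, x^2 and the Legendre coefficient p(x) = 1 - x^2.
basis = [Polynomial([1.0]), Polynomial([0.0, 1.0]), Polynomial([0.0, 0.0, 1.0])]
p = Polynomial([1.0, 0.0, -1.0])

def integral(poly, lo=-1.0, hi=1.0):
    """Exact integral of a polynomial over [lo, hi] via its antiderivative."""
    anti = poly.integ()
    return anti(hi) - anti(lo)

# A[i][j] = integral of p * phi_i' * phi_j'  (the integral to be made stationary)
# B[i][j] = integral of phi_i * phi_j        (the normalization integral, w = 1)
A = np.array([[integral(p * bi.deriv() * bj.deriv()) for bj in basis] for bi in basis])
B = np.array([[integral(bi * bj) for bj in basis] for bi in basis])

# Stationarity under the normalization constraint gives A c = lambda B c.
eigenvalues, vectors = np.linalg.eig(np.linalg.solve(B, A))
order = np.argsort(eigenvalues.real)
eigenvalues = eigenvalues.real[order]
vectors = vectors.real[:, order]

# Scale each coefficient vector (a, b, c) so that the normalization integral is 1.
vectors = vectors / np.sqrt(np.einsum("ik,ij,jk->k", vectors, B, vectors))

print(eigenvalues)   # stationary values of the integral
print(vectors.T)     # rows: coefficients (a, b, c) of each solution, up to sign
```

Each row of the printed coefficient array is determined only up to an overall sign, since the normalization fixes $u^2$ and not $u$.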

The reader will notice that the $\lambda$'s are the first three values of $l(l + 1)$, and


let $u_1(x)$ be that function which produces the lowest minimum of $\Lambda$ while
satisfying (45), and let $\lambda_1$ be the eigenvalue corresponding to it. We now
seek a function $u_2(x)$ which will also produce a minimum of $\Lambda$ subject to
(45), but which, in addition, shall be orthogonal to $u_1$:

$$\int_a^b wu_1u_2\, dx = 0 \tag{8-47}$$

The Euler equation for $u_2$ is more complicated than that for $u_1$, since $u_2$
must satisfy two accessory conditions and $u_1$ only one. Now

$$K = pu_2'^2 + qu_2^2 - \lambda_2 wu_2^2 - \mu wu_1u_2$$

$\mu$ being a new Lagrangian multiplier. Hence eq. (46) has the form

$$2qu_2 - 2\lambda_2 wu_2 - \mu wu_1 - 2(pu_2')' = 0$$

and this is identical with

$$L(u_2) + \lambda_2 wu_2 + \tfrac{1}{2}\mu wu_1 = 0$$

To determine the value of $\mu$ we multiply this equation by $u_1$ and
integrate, making use of relation (43) which, of course, we require $u_1$ and $u_2$
to obey. The result is

$$\int_a^b u_2L(u_1)\, dx + \lambda_2 \int_a^b wu_1u_2\, dx + \tfrac{1}{2}\mu \int_a^b wu_1^2\, dx = 0 \tag{8-48}$$

Here the first term is $-\lambda_1 \int_a^b wu_1u_2\, dx$ because $u_1$ satisfies (40); it
equals zero because of (47). For the latter reason, the second term of
(48) also vanishes. But the integral appearing in the last term is
certainly finite. Hence, we conclude that the multiplier $\mu = 0$. We might as
well not have required relation (47): $u_2$ satisfies the same equation as $u_1$,
but for a different eigenvalue $\lambda_2$. Moreover, it is automatically orthogonal
to $u_1$.

This process may be continued. Suppose we seek a function $u_3$ which
will minimize $\Lambda$, subject to the three conditions

$$\int_a^b wu_3^2\, dx = 1, \qquad \int_a^b wu_1u_3\, dx = \int_a^b wu_2u_3\, dx = 0$$

The minimum thus obtained will lie at least as high as the preceding one,
for the choice of functions has been further restricted. The function $K$
appearing in Euler's equation now contains three undetermined multipliers,
$\lambda_3$, $\mu$, and $\nu$. The last two of these may be shown to vanish


nalized, (3) are mutually orthogonal, they are found as solutions of 
(40). 

Conversely, it is easy to show that all solutions of (40) belonging to
different eigenvalues are orthogonal. To do this, one need only multiply
two specific forms of (40):

$$L(u_i) + \lambda_i wu_i = 0, \qquad L(u_j) + \lambda_j wu_j = 0$$

by $u_j$ and $u_i$ respectively, integrate each equation and subtract. When
(43) is used, the result is simply

$$(\lambda_i - \lambda_j) \int_a^b wu_iu_j\, dx = 0 \tag{8-49}$$

Hence, either $\lambda_i = \lambda_j$, or $u_i$ and $u_j$ are orthogonal.

The case in which $\lambda_i = \lambda_j$, where two (or more) eigenfunctions belong
to the same eigenvalue, is not of very great interest under the simple
conditions we are here considering (real eigenfunctions, one independent
variable). It is very much more important in the more general eigenvalue
problems of Chapter 11. As to terminology, whenever several
eigenfunctions, i.e., linearly independent eigenfunctions, are associated with one
eigenvalue, that eigenvalue is said to be degenerate.

It may seem strange to find eq. (49) predicting orthogonality only for
non-degenerate cases, while the variational argument of the preceding
paragraphs implies no restriction of this sort. Harmony is restored when we
realize that a set of linearly independent solutions of eq. (40) can always be
combined in such a way as to form an equally numerous, equivalent set of
orthogonal solutions. (Cf., for instance, the method of Schmidt.$^{14}$)
Hence we may, if we like, speak of the orthogonality of all solutions
of eq. (40), assuming tacitly that the process of orthogonalization
has been carried out on all sets of functions belonging to a degenerate
eigenvalue.

One further point is to be made in connection with the variational
property of the solutions of the Sturm-Liouville equation. We have seen
that the $u_i$ minimize the integral $\Lambda$. What are the stationary values of $\Lambda$
thus produced? Let us compute them.

$$\Lambda(u_i) = \int_a^b (pu_i'^2 + qu_i^2)\, dx = u_ipu_i'\,\Big|_a^b - \int_a^b [u_i(pu_i')' - u_iqu_i]\, dx$$

$$= -\int_a^b u_iL(u_i)\, dx = \lambda_i \int_a^b wu_i^2\, dx = \lambda_i \tag{8-51}$$

The simple and interesting answer is, then, that the stationary values of $\Lambda$
are the eigenvalues $\lambda_i$.

$^{14}$ The use of this method for functions instead of vectors is illustrated in Lindsay
and Margenau, "Foundations of Physics," John Wiley and Sons, p. 425.


Example. Degeneracy arises when, in the vibrating string problem
expressed by eq. (2), one replaces the ordinary boundary conditions (a)
and (b), sec. 2, by one requiring only periodicity:

$$S(a) = S(b) \tag{8-50}$$

The eigenvalue parameter, $\lambda$, in this equation is $k^2$. Moreover, it is to be
noted that the periodicity condition (50) conforms to our general
requirement (42). The solution satisfying (50) is easily seen to be

$$S = A \sin\left(\delta + \frac{2\pi n}{l} x\right)$$

where $l = b - a$, and $\delta$ is arbitrary, the quantity $k$ taking on the values
$2\pi n/l$. But $n$ may be a positive or a negative integer. Hence to the
same value of $k^2$, namely $4\pi^2n^2/l^2$, there correspond the two functions

$$S_1 = A_1 \sin\left(\delta + \frac{2\pi n}{l} x\right), \qquad S_2 = A_2 \sin\left(\delta - \frac{2\pi n}{l} x\right)$$

Except when $\delta$ is an integral multiple of $\pi$, as it must be when the ordinary
boundary condition is imposed, $S_1$ and $S_2$ are linearly independent. Yet
they are not orthogonal (except in the special case when $\delta = \pi/4$). It is
easily seen, however, that if we put 



wherein $\alpha = \sin^2 \delta - \cos^2 \delta$, we have a pair of functions, satisfying the
differential equation for the same $k^2$, which are both orthogonal and normal.

Problem. The integral $\Lambda(u)$ for the differential equation $u'' + k^2u = 0$ is
$\int_0^l (u')^2\, dx$. Assume for $u$ any normalized polynomial containing the factors
$x$ and $x - l$, and show that $\Lambda$ computed for this $u$ is greater than the lowest
eigenvalue $\pi^2/l^2$.


8.9. Distribution of High Eigenvalues. Preceding considerations indicate
no uniform law according to which the eigenvalues of any differential
equation are arranged; regularity does, however, prevail for the "high"
eigenvalues, as will now be shown. Let all $\lambda$'s be arranged in numerical





strongly suggests,$^{15}$ and the examples confirm this expectation. We
shall now prove the theorem:

$$\lim_{n \to \infty} \lambda_n = \text{const.}\; n^2 \tag{8-52}$$


By the substitutions

$$z = (pw)^{1/4} u, \qquad t = \int_a^x \left(\frac{w}{p}\right)^{1/2} dx$$

the Sturm-Liouville equation takes the form

$$\frac{d^2z}{dt^2} - f(t)z + \lambda z = 0 \tag{8-53}$$

Detailed consideration which may be left to the reader shows that the
function $f(t)$ is bounded.

Now consider, in place of (53), the differential equation

$$\frac{d^2\bar{z}}{dt^2} + \lambda' \bar{z} = 0 \tag{8-53'}$$

Its eigenvalues are minima of $\int_0^\tau \left(\frac{d\bar{z}}{dt}\right)^2 dt$, where $\tau$ is the value of $t$ at
$x = b$, namely $\tau = \int_a^b \left(\frac{w}{p}\right)^{1/2} dx$. On the other hand, the eigenvalues of
(53) are the minima of $\int_0^\tau \left[\left(\frac{dz}{dt}\right)^2 + fz^2\right] dt$. Thus, for normalized functions,

$$\lambda = \text{minimum of } \int_0^\tau \left[\left(\frac{dz}{dt}\right)^2 + fz^2\right] dt, \qquad \lambda' = \text{minimum of } \int_0^\tau \left(\frac{d\bar{z}}{dt}\right)^2 dt$$

Assume that $\bar{z}$ is the specific function which produces the minimum $\lambda'$,
whereas $z$ produces $\lambda$. If we compute $\lambda$ using $\bar{z}$, we shall obtain a value
for the integral that is greater than its minimum, $\lambda$. Hence,

$$\lambda < \int_0^\tau \left[\left(\frac{d\bar{z}}{dt}\right)^2 + f\bar{z}^2\right] dt = \lambda' + \int_0^\tau f\bar{z}^2\, dt$$

Since $f$ is bounded and $\bar{z}$ is normalized, $\int_0^\tau f\bar{z}^2\, dt$ has some finite value
$F'$, so that

$$\lambda < \lambda' + F'$$

If we proceed in the reverse manner and use $z$ in computing $\lambda'$, we obtain

$$\lambda' < \int_0^\tau \left(\frac{dz}{dt}\right)^2 dt = \lambda - \int_0^\tau fz^2\, dt = \lambda - F$$

where $F$ is again finite. Upon combining the last two inequalities we find

$$\lambda' + F < \lambda < \lambda' + F'$$

which means that $\lambda$ can differ from $\lambda'$ by only a finite amount. If the
$\lambda'$-values tend to $\infty$, the $\lambda$'s also do.

$^{15}$ Suppose $u_n$ produces the minimum $\lambda_n$. Of the function $u_{n+1}$ we require that it be
orthogonal not only to the $n - 1$ functions with respect to which $u_n$ has this property,
but also to $u_n$ itself. Hence the class of functions from which $u_{n+1}$ must be chosen is
narrower than that available for $u_n$.

But the eigenvalues of (53') are well known. They depend, of course, on
the boundary conditions for $z$, and hence for $u$. If $u$ vanishes at both $a$
and $b$, so that $z$ vanishes at $0$ and $\tau$, the eigenvalues are

$$\lambda_n' = \frac{n^2\pi^2}{\tau^2}$$

In case only periodicity of $z$ is required (see example of preceding section),
the eigenvalues are

$$\lambda_n' = \frac{4n^2\pi^2}{\tau^2}$$

In any case,

$$\lambda' = \text{const.}\; n^2$$

Since the "high" eigenvalues $\lambda$ approach the "high" values of $\lambda'$, theorem
(52) is established. It is to be observed that our result in this particular
form is conditional upon the assumption of a finite $\tau$, which is usually
equivalent to a finite range of $x$. Several of the equations listed in Table 2 are
ordinarily treated for infinite ranges of the independent variable; for
these, theorem (52) is not valid because $\tau$ becomes $\infty$. Hermite's
equation is a case in point: its eigenvalues are proportional to $n$ rather than
$n^2$ even asymptotically. But here, as well as in all other cases, it is still
true that

$$\lambda_n \to \infty \text{ as } n \to \infty \tag{8-54}$$
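The statement that $\lambda$ and $\lambda'$ differ by a bounded amount can be made visible with a crude finite-difference calculation. The sketch below is an illustration under assumptions of our own choosing: the bounded function $f(t) = \cos t$, the range $\tau = \pi$ (for which $\lambda_n' = n^2$ under the boundary condition $z(0) = z(\tau) = 0$), the grid size, and the function name `dirichlet_eigenvalues`. It assumes NumPy is available.

```python
import numpy as np

def dirichlet_eigenvalues(f, tau, N=600, count=10):
    """Lowest eigenvalues of -z'' + f(t) z = lambda z with z(0) = z(tau) = 0.

    The second derivative is replaced by the usual three-point difference
    quotient on N - 1 interior grid points, which gives a symmetric matrix.
    """
    h = tau / N
    t = np.linspace(h, tau - h, N - 1)          # interior grid points
    main = 2.0 / h**2 + f(t)                    # diagonal entries
    off = -np.ones(N - 2) / h**2                # off-diagonal entries
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:count]        # eigenvalues in ascending order

tau = np.pi
lam = dirichlet_eigenvalues(np.cos, tau)        # assumed bounded f(t) = cos t
n = np.arange(1, len(lam) + 1)
lam_prime = (n * np.pi / tau) ** 2              # eigenvalues of the comparison problem

for k, a, b in zip(n, lam, lam_prime):
    print(k, a, b, a / b)
```

The differences between the two columns stay within the bound set by $|f| \le 1$ (apart from a small discretization error), so the ratio in the last column approaches unity as $n$ grows.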

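It is interesting to note that the solutions of eq. (53') are asymptotically
(for large $\lambda$) equal to those of (53). Thus

$$\lim_{n \to \infty} z_n = A_n \sin \frac{n\pi}{\tau} t$$

provided the boundary condition is: $z(0) = z(\tau) = 0$. In terms of $u$ this
reads

$$\lim_{n \to \infty} u_n = A_n (pw)^{-1/4} \sin\left[\frac{n\pi}{\tau} \int_a^x \left(\frac{w}{p}\right)^{1/2} dx\right]$$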

8.10. Completeness of Eigenfunctions. In sec. 2 there appeared a
qualitative, although crude, definition of completeness. We now wish to
give that definition greater precision and to prove it under the conditions
outlined in sec. 8.7. A system of functions $u_1, u_2, \cdots$ is complete if it is
possible to "approximate in the mean" any function $f(x)$, satisfying the
same boundary conditions as the $u$'s, by means of a series $\sum a_iu_i$, that is, if

$$\lim_{n \to \infty} \int_a^b \left(f - \sum_{i=1}^n a_iu_i\right)^2 w\, dx = 0 \tag{8-55}$$

We are here concerned with functions $u$ which are solutions of eq. (40);
hence we know them to be orthogonal. This permits at once the
determination of the coefficients $a_i$. If, for any given, finite $n$, we wish to make the
quantity

$$N = \int_a^b \left(f - \sum_{i=1}^n a_iu_i\right)^2 w\, dx$$

as small as possible, then

$$\frac{\partial N}{\partial a_j} = 0 \text{ for } j = 1, 2, \cdots, n$$

The differentiations may be carried out under the integral sign, so that

$$\int_a^b \left(f - \sum_{i=1}^n a_iu_i\right) u_j w\, dx = 0$$

whence, the $u$'s being orthogonal and normalized,

$$a_j = \int_a^b fu_jw\, dx \tag{8-56}$$

This, then, is the best choice of coefficients with which we may hope to
satisfy (55).
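A small numerical experiment shows the mean square error $N$ shrinking as more terms are taken. The example is an assumption made for illustration: the normalized string eigenfunctions $u_i = \sqrt{2} \sin i\pi x$ on $0 \le x \le 1$ with $w = 1$, the function $f = x(1 - x)$, which satisfies the same boundary conditions, and the helper names used in the code.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3.0

def u(i, x):
    """Assumed orthonormal eigenfunctions of the string on [0, 1], weight w = 1."""
    return math.sqrt(2.0) * math.sin(i * math.pi * x)

def f(x):
    """Function to be expanded; it vanishes at both end points like the u's."""
    return x * (1.0 - x)

def coefficient(j):
    """a_j = integral of f u_j w dx, the best coefficient in the mean."""
    return simpson(lambda x: f(x) * u(j, x), 0.0, 1.0)

def mean_square_error(n):
    """N = integral of (f - sum_{i<=n} a_i u_i)^2 w dx."""
    a = [coefficient(i) for i in range(1, n + 1)]
    partial = lambda x: sum(a[i - 1] * u(i, x) for i in range(1, n + 1))
    return simpson(lambda x: (f(x) - partial(x)) ** 2, 0.0, 1.0)

errors = [mean_square_error(n) for n in (1, 3, 5, 7)]
print(coefficient(1), errors)
```

Because the $u_i$ are orthonormal, each added term lowers $N$ by exactly $a_i^2$, which is why the printed errors decrease monotonically.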

Now introduce the following abbreviations

$$\Delta_n = f - \sum_{i=1}^n a_iu_i, \qquad c_n^2 = \int_a^b \Delta_n^2 w\, dx$$

We shall show that the function $\Delta_n/c_n$ has the following properties:
(1) it is normalized, (2) it is orthogonal to every $u_i$ up to and including
$u_n$. The first property is obvious; the second is easily seen as follows:

$$\int_a^b \frac{\Delta_n}{c_n} u_iw\, dx = \frac{1}{c_n} \left[\int_a^b fu_iw\, dx - \sum_{j=1}^n a_j \int_a^b u_ju_iw\, dx\right]$$

$$= \frac{1}{c_n}(a_i - a_i) = 0 \text{ if } i \le n; \qquad = \frac{1}{c_n}(a_i - 0) \text{ if } i > n$$

But if $\Delta_n/c_n$ has these two properties, it satisfies all the conditions which, in
the variational procedure, we imposed upon $u_{n+1}$ except that of minimizing
$\Lambda$. Hence it is clear that

$$\Lambda\left(\frac{\Delta_n}{c_n}\right) \ge \Lambda(u_{n+1})$$

and this means:

$$\frac{1}{c_n^2} \Lambda(\Delta_n) \ge \lambda_{n+1} \tag{8-57}$$

The remainder of our argument consists in proving that $\Lambda(\Delta_n)$ is finite. If
the reader will accept this fact,$^{16}$ which is almost obvious from the meaning
of $\Delta_n$, the last inequality leads at once to (55); for as $n$ approaches infinity,
the right-hand side tends to infinity in view of (54), hence

$$\lim_{n \to \infty} c_n^2 = 0$$

This is the same as (55).

$^{16}$ For the more exacting reader, we here indicate the proof. The integral $\Lambda(\Delta_n)$
may be transformed in accordance with the first three steps of (51) into

$$\Lambda(\Delta_n) = -\int_a^b \Delta_nL(\Delta_n)\, dx$$

But

$$\int_a^b \Delta_nL(\Delta_n)\, dx = \int_a^b \left(f - \sum a_iu_i\right) L\left(f - \sum a_iu_i\right) dx = \int_a^b fL(f)\, dx + \sum_{i=1}^n a_i^2\lambda_i$$

because

$$L(u_i) = -\lambda_iwu_i \quad \text{and} \quad \int_a^b u_iL(f)\, dx = \int_a^b fL(u_i)\, dx = -a_i\lambda_i$$

Hence,

$$\Lambda(\Delta_n) = \Lambda(f) - \sum_{i=1}^n a_i^2\lambda_i$$

The existence of $\Lambda(f)$ must be assumed, for otherwise an expansion of $f$ in terms of the
$u$'s may be impossible. Moreover, $f$ and therefore the approximating function
$\varphi_n = \sum_{i=1}^n a_iu_i$ must possess integrable squares. Let us suppose that

$$\int_a^b \varphi_n^2 w\, dx = \sum_{i=1}^n a_i^2 = M_n$$

If we add zero in the form $\sum_{i=1}^n a_i^2\lambda_1 - M_n\lambda_1$, where $\lambda_1$ is the lowest of all eigenvalues, to
the last expression for $\Lambda(\Delta_n)$ we obtain

$$\Lambda(\Delta_n) = \Lambda(f) - \sum_{i=1}^n a_i^2(\lambda_i - \lambda_1) - M_n\lambda_1$$

The difference $\Lambda(f) - M_n\lambda_1$ is certainly finite for all $n$. Let us call it $A$. The
summation on the right consists of positive terms only. Inequality (57) may therefore
certainly be written

$$\frac{A}{c_n^2} \ge \lambda_{n+1}$$


8.11. Further Comments and Generalizations. In the last section we
have shown, not that

$$f = \sum_{i=1}^{\infty} a_iu_i \tag{8-58}$$

but rather that the series on the right approximates $f$ "in the mean" in
accordance with eq. (55). To put the difference more concretely: Eq. (55)
may be true and yet (58) may not hold for all points of the range
$(a \le x \le b)$. It is clear that if (58) is true almost everywhere but fails
at a finite set of points, the contribution of these points to the integral in
(55) would be nil and that equation would be true. To prove (58) in
addition to (55) would involve the establishment of absolute and uniform
convergence of the series $\sum_{i=1}^{\infty} a_iu_i$. For the solutions of eq. (40) with
boundary conditions of the type here chosen this can indeed be done,$^{17}$ and
the reader need not be excessively concerned over the difference between
"completeness" (expressed by eq. 55) and the possibility of expansion of
an arbitrary function (indicated by 58).

The preceding theory has always involved the assumption of a finite
range, $b - a$, of the independent variable. This is clearly a serious
limitation, for it excludes the usual solutions of a number of the equations listed
in Table 2. To develop a rigorous account of the situation arising when
the range is extended to infinity is not easy, but what happens qualitatively
under such conditions can be readily seen.

Consider again the vibrating string with eigenvalues $k^2 = n^2\pi^2/l^2$.
As $l$ tends to infinity, these eigenvalues move closer together until in the
limit they form a continuum. The eigenfunctions are still of the form
$A \sin(kx + \delta)$, but they refuse to be normalized in the former sense; for
clearly the integral $\int \sin^2 kx\, dx$, when taken over an infinite range,

diverges. Also, since the eigenvalues are no longer discrete, our definition 
of orthogonality loses its sense. However, completeness is still guaranteed 
since what was originally a Fourier series will now become a Fourier inte- 
gral (cf. eq. 14). The difficulty concerning orthogonality and normaliza- 
tion can, however, be avoided by introducing “ eigendifferentials ” instead 
of eigenfunctions. 18 

The situation brought about by an extension of the range may be even 
more complicated than this. We shall see in Chapter 11 that the differen- 
tial equation describing the hydrogen atom (eq. 11-55), which is closely 


related to Laguerre's, admits, because of its infinite range, both a discrete
and a continuous set of eigenvalues ("spectrum"). This is a situation
of very frequent occurrence. On the other hand, eigenvalues associated
with a Sturm-Liouville problem of infinite range are not necessarily
continuous, as the example of the simple harmonic oscillator
or Hermite's differential equation (eq. 2-62) clearly shows.

No mention has thus far been made of the possibility that solutions
of the Sturm-Liouville equation may possess singularities in the range
$a \le x \le b$. Troubles of this sort might have been circumvented by
postulating that the function $p$ appearing in eq. (41) be of constant
sign and never zero, as is sometimes done in treatments of the Sturm-Liouville
problem. This, however, would have excluded some interesting equations of
Table 2, notably Legendre's equation which has (non-essential) singular
points at $x = \pm 1$, and Hermite's equation which has an essential
singularity at $\infty$. Suffice it to say here that these matters, although of
considerable fundamental interest, occasion no modification of the conclusions here
derived. Attention is given to them in Kemble's book (loc. cit.).

The solutions of eq. (40) have been assumed to be real
throughout this section. If the functions $p$, $q$, and $w$ are real, this entails
no loss in generality. Suppose that a complex function $u = X + iY$
were admitted as solution of the differential equation$^{19}$; this would
imply that both $X$ and $Y$ are real solutions belonging to the same
eigenvalue. Thus, whenever complex solutions arise and are compatible with
the boundary conditions, we may at once conclude that the corresponding
eigenvalue is degenerate. (In the complex scheme, $u = X + iY$ and
$u^* = X - iY$ are linearly independent solutions.) If now we impose the
normalizing condition

$$\int u^*uw\, dx = 1$$

we are merely postulating that, in place of the usual

$$\int X^2w\, dx = \int Y^2w\, dx = 1$$

$$\int X^2w\, dx + \int Y^2w\, dx = 1$$

shall hold. In other words, we are operating, in the complex scheme, with
linear combinations of the real functions, and with a different
normalization. Orthogonality, if defined by eq. (5*) instead of (5), retains its
ordinary meaning, for

$$\int u_1^*u_2w\, dx = \int [X_1X_2 + Y_1Y_2 + iX_1Y_2 - iY_1X_2]w\, dx = 0$$

is an immediate consequence of the fact that eigenfunctions belonging to
different eigenvalues $\lambda_1$ and $\lambda_2$ are orthogonal. Furthermore, if we require
$u^*$ and $u$ to be orthogonal,

$$\int u^2w\, dx = \int (X^2 - Y^2 + 2iXY)w\, dx = \int X^2w\, dx - \int Y^2w\, dx = 0$$

provided $X$ and $Y$ have been chosen orthogonal. Thus both $X$ and $Y$ are
normalized to $\tfrac{1}{2}$ when the complex formalism is used. In view of these
simple facts the validity of the completeness proof remains intact
(for complex $f$ and complex $u$); only formal changes are necessary.
The $a_i$ become complex, and completeness is defined by the relation

$$\lim_{n \to \infty} \int \Delta_n^*\Delta_nw\, dx = 0.$$

Complete revision of the theory is necessary when the
coefficients $p$, $q$, $w$ are permitted to be complex.

Finally, it is appropriate to remark that our development has been
restricted to one dimension. The Sturm-Liouville theory can be
generalized without great difficulty to certain partial differential equations with
much the same results. For this generalization we refer the reader to
Courant-Hilbert.

Eigenvalue problems arise in the most diverse fields of physics and
chemistry. Many of them are treated in:

Morse, P. M., and Feshbach, H., "Methods of Theoretical Physics," McGraw-Hill,
New York, 1953.

Jeffreys, H., and Jeffreys, B. S., "Methods of Mathematical Physics," Cambridge
University Press, Second Edition, 1950.

See also the bibliography on quantum mechanics. 


CHAPTER 9 

MECHANICS OF MOLECULES 


9.1. Introduction. — As an illustration of the mathematical methods 
used in mechanics, we discuss in this chapter an important physical and 
chemical problem, namely, the motion of a molecule containing n atoms. 
We limit ourselves to this single topic for several reasons: its complexity 
requires us to describe most of the mathematics used in mechanics; the 
same methods may be extended to other problems, for example, the 
motions of particles within the atomic nucleus or the motions of a macro- 
scopic body such as an aeroplane 1 ; and finally because the structure of the 
polyatomic molecule and its spectra are matters of considerable interest to 
many chemists and physicists. This chapter will also present an oppor- 
tunity for dealing with the purely mathematical question of how to 
describe the configuration of a rigid body (Euler’s angles, etc.), a matter 
which is of great generality and must be included in a survey of mathemati- 
cal methods used in science. Many adequate accounts of classical mechan- 
ics 2 exist so that we do no more here than recall briefly some of the principles 
of that subject before proceeding to the special problem in which we are 
interested. 

9.2. General Principles of Classical Mechanics. A free particle is one
whose motion is completely unrestricted. It is said to have three degrees
of freedom, for its position is uniquely determined at any instant by three
independent coordinates. Consider a system containing $n$ such particles,
where the instantaneous position of the $i$-th particle of mass $m_i$ is specified
by the vector $\mathbf{r}_i$. If $\mathbf{F}_i$ is the vector resultant of all the forces acting upon
the particle then the motion of the system is described by Newton's
equations which may be written in the form

$$m_i \frac{d^2\mathbf{r}_i}{dt^2} = m_i\ddot{\mathbf{r}}_i = \mathbf{F}_i; \quad (i = 1, 2, \cdots, n) \tag{9-1}$$

In many cases, the particles composing the system are not free but 
restricted. For example, a member of the system may be allowed to 

1 See Frazer, R. A., Duncan, W. J., and Collar, A. R., “ Elementary Matrices,” 




move only on a surface, so that its degrees of freedom become two. Under
such circumstances the equation of the surface is called the constraint. In a
similar way if the particle is required to move along a line, there is only one
degree of freedom and the two equations which define the line are the
constraints. If the sum of the degrees of freedom of all the particles is
$k < 3n$, then the system may be regarded as a collection of free particles
subjected to $3n - k$ independent constraints so that only $k$ coordinates
are needed to describe the motion of the system. These new coordinates
$q_1, q_2, \cdots, q_k$ are related to the Cartesian coordinates of the particles (cf.
eqs. 5-1 and 5-2); they are called the generalized coordinates of Lagrange.

If, for convenience, we let the Cartesian components of $\mathbf{r}_1$ be $x_1$, $x_2$, $x_3$;
the components of $\mathbf{r}_2$ be $x_4$, $x_5$, $x_6$ and so on (remembering also that
$m_1 = m_2 = m_3$; $m_4 = m_5 = m_6$; etc.), then the kinetic energy $T$ of the
system is given by

$$2T = \sum_{i=1}^{3n} m_i\dot{x}_i^2 = \sum_{r=1}^{k} \sum_{s=1}^{k} A_{rs}\dot{q}_r\dot{q}_s \tag{9-2}$$

where

$$A_{rs} = \sum_{i=1}^{3n} m_i \frac{\partial x_i}{\partial q_r} \frac{\partial x_i}{\partial q_s} \tag{9-3}$$

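Since the components of momentum in Cartesian coordinates are

$$p_i = m_i\dot{x}_i = \frac{\partial T}{\partial \dot{x}_i}$$

we define, by analogy, the generalized momenta as

$$p_r(q_1, q_2, \cdots, q_k; \dot{q}_1, \dot{q}_2, \cdots, \dot{q}_k) = \frac{\partial T}{\partial \dot{q}_r} = \sum_{s=1}^{k} A_{rs}\dot{q}_s \tag{9-4}$$

In many physical problems, the system is conservative, that is, a
potential function $V(q_1, q_2, \cdots, q_k)$ exists such that

$$Q_i = -\frac{\partial V}{\partial q_i}; \quad (i = 1, 2, \cdots, k) \tag{9-5}$$

Then, as was shown in sec. 6.3 (cf. eq. 6-11), Lagrange's equations of
motion are

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_i}\right) - \frac{\partial T}{\partial q_i} = -\frac{\partial V}{\partial q_i}; \quad (i = 1, 2, \cdots, k) \tag{9-6}$$

This is a set of $k$ differential equations of second order with $q_1, q_2, \cdots, q_k$
as dependent variables and $t$ as independent variable.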

If we define the Lagrangian function

$$L = T - V \tag{9-7}$$

eq. (6) becomes

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = 0 \tag{9-8}$$

The solution of Lagrange's equations in either form (6) or (8) will result
in an expression for each generalized coordinate $q_i$ as a function of time and
$2k$ constants of integration. The latter must be determined from the
initial conditions of the $n$ particles of the system.

It is often of advantage to transform (5) or (8) to a set of 2k first ordei 
differential equations. From (4), (8), and the definition L = T — V, 
we have 


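$$p_i = \frac{\partial L}{\partial \dot{q}_i}; \quad \dot{p}_i = \frac{\partial L}{\partial q_i} \tag{9-9}$$

We now define the Hamiltonian function

$$H = \sum_{i=1}^{k} p_i \dot{q}_i - L \tag{9-10}$$

Its total differential is

$$dH = \sum_{i=1}^{k} p_i\, d\dot{q}_i + \sum_{i=1}^{k} \dot{q}_i\, dp_i - \sum_{i=1}^{k} \frac{\partial L}{\partial q_i}\, dq_i - \sum_{i=1}^{k} \frac{\partial L}{\partial \dot{q}_i}\, d\dot{q}_i$$

But by using (9), the first and last terms cancel, giving

$$dH = \sum_{i=1}^{k} \dot{q}_i\, dp_i - \sum_{i=1}^{k} \frac{\partial L}{\partial q_i}\, dq_i \tag{9-11}$$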


This equation depends only on $dp_i$ and $dq_i$ but not on $d\dot{q}_i$, hence H is a
function of q and p alone and we may write

$$dH = \sum_{i=1}^{k} \frac{\partial H}{\partial p_i}\, dp_i + \sum_{i=1}^{k} \frac{\partial H}{\partial q_i}\, dq_i \tag{9-12}$$

Comparison of (11) with (12) shows us that

$$\frac{\partial H}{\partial p_i} = \dot{q}_i; \quad \frac{\partial H}{\partial q_i} = -\frac{\partial L}{\partial q_i} = -\dot{p}_i; \quad (i = 1, 2, \ldots, k) \tag{9-13}$$

The resulting first order differential equations (13), 2k in number, are
Hamilton's canonical equations of motion; $p_i$ and $q_i$ are said to be
canonically conjugate variables.

Problem. Show that $2T = \sum_i p_i \dot{q}_i$ and $H = T + V$.
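The two statements of the problem can be checked numerically. The sketch below is an illustration only and is not part of the original text: it assumes a kinetic energy of the quadratic form (9-2) and a potential independent of the velocities, and the matrix `A`, the velocities `qdot`, the value `V`, and all function names are made-up.

```python
# Numerical check of 2T = sum_i p_i*qdot_i and H = T + V for a quadratic
# kinetic energy. All numbers below are arbitrary illustration values.

def generalized_momenta(A, qdot):
    """p_r = sum_s A[r][s] * qdot[s], as in eq. (9-4)."""
    k = len(qdot)
    return [sum(A[r][s] * qdot[s] for s in range(k)) for r in range(k)]

def kinetic_energy(A, qdot):
    """T = 1/2 sum_{r,s} A[r][s] * qdot[r] * qdot[s], as in eq. (9-2)."""
    k = len(qdot)
    return 0.5 * sum(A[r][s] * qdot[r] * qdot[s]
                     for r in range(k) for s in range(k))

def hamiltonian(A, qdot, V):
    """H = sum_i p_i*qdot_i - L with L = T - V, as in eq. (9-10)."""
    p = generalized_momenta(A, qdot)
    T = kinetic_energy(A, qdot)
    return sum(pi * qi for pi, qi in zip(p, qdot)) - (T - V)

A = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical symmetric coefficients A_rs
qdot = [0.3, -1.2]             # hypothetical generalized velocities
V = 0.7                        # hypothetical potential energy value

p = generalized_momenta(A, qdot)
T = kinetic_energy(A, qdot)
print(sum(pi * qi for pi, qi in zip(p, qdot)), 2 * T)
print(hamiltonian(A, qdot, V), T + V)
```

The check relies on T being homogeneous of second degree in the velocities; with time-dependent constraints that assumption, and therefore H = T + V, need not hold.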




A rigid body is a system of n particles bound together by interior forces in such a way
that the distance between the i-th and j-th particles is constant and
unaffected by any external force to which the system is subjected. Suppose
$x_i, y_i, z_i$ are the Cartesian coordinates of the i-th particle, then the
distance between the i-th and j-th particle is

$$r_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} = \text{constant}; \quad (i, j = 1, 2, \ldots, n) \tag{9-14}$$

It is readily shown that the most general displacement of a body of this
type may be obtained in a variety of ways by a combination of translation
and rotation about an axis fixed in the body. The proof of this fact, known
as Chasles' theorem, may be found, for example, in Whittaker, loc. cit. The
choice of a reference point, that is, the origin of the vector which locates the
fixed axis, is entirely arbitrary. For a given displacement, this point may
be chosen in such a way that the translation is parallel to the axis of
rotation. With this choice of reference point, each displacement can be effected
in one and only one way, the resulting motion being similar to the
displacement of a nut on a threaded screw. It is thus only necessary to consider
translation and rotation in order to study the most general motion of a rigid
body. It should be remembered, however, that the axis of rotation may
be continually changing its direction, hence we usually refer to an
instantaneous axis of rotation.

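9.4. Velocity, Angular Momentum, and Kinetic Energy. — Suppose a
rigid body is rotating about an axis with a constant angular velocity $\boldsymbol{\omega}$;
then the linear velocity of any point P in the body is given by

$$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r} \tag{9-15}$$

where $\mathbf{r}$ is a radius vector drawn to P from a fixed point O on the axis of
rotation (see eq. 4-16). If the point P has a mass m, its momentum is

$$m\mathbf{v} = m(\boldsymbol{\omega} \times \mathbf{r}) \tag{9-16}$$

and its moment of momentum or angular momentum (see sec. 4.5) about the
point O is

$$\mathbf{M} = \mathbf{r} \times m\mathbf{v} = m[\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})] \tag{9-17}$$

Suppose the fixed point O about which the body is rotating is taken as the
origin of a Cartesian coordinate system OXYZ, the components of $\boldsymbol{\omega}$ are
$\omega_x, \omega_y, \omega_z$ and the components of $\mathbf{r}$ are x, y, z. Then in accordance with
(4-13), the components of $\mathbf{v}$ are:

$$v_x = z\omega_y - y\omega_z; \quad v_y = x\omega_z - z\omega_x; \quad v_z = y\omega_x - x\omega_y \tag{9-18}$$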




and the components of $\mathbf{M}$ are:

$$M_x = m(y v_z - z v_y); \quad M_y = m(z v_x - x v_z); \quad M_z = m(x v_y - y v_x) \tag{9-19}$$

On combining (18) and (19) there results

$$M_x = A\omega_x - F\omega_y - E\omega_z$$
$$M_y = B\omega_y - D\omega_z - F\omega_x \tag{9-20}$$
$$M_z = C\omega_z - E\omega_x - D\omega_y$$

where A, B, C are moments of inertia and D, E, F are products of inertia:

$$A = m(y^2 + z^2); \quad D = myz$$
$$B = m(z^2 + x^2); \quad E = mzx \tag{9-21}$$
$$C = m(x^2 + y^2); \quad F = mxy$$

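The kinetic energy T of the particle at P is given by

$$2T = m\mathbf{v} \cdot (\boldsymbol{\omega} \times \mathbf{r}) = m[\mathbf{v}\boldsymbol{\omega}\mathbf{r}] = m[\boldsymbol{\omega}\mathbf{r}\mathbf{v}] = m\boldsymbol{\omega} \cdot (\mathbf{r} \times \mathbf{v}) = \boldsymbol{\omega} \cdot \mathbf{M} \tag{9-22}$$

where we have used eqs. (4-17), (4-18), and (9-17). Thus, in view of
(20) we find

$$2T = A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_y\omega_z - 2E\omega_z\omega_x - 2F\omega_x\omega_y \tag{9-23}$$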
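Equations (9-15) to (9-23) can be verified for a single particle with a few lines of arithmetic. The following is a minimal sketch under stated assumptions, not a procedure given in the text: the mass, position, and angular velocity are arbitrary sample values, and the helper names are hypothetical.

```python
# Check that 2T = m v.v = omega.M and that it agrees with the expansion in
# moments (A, B, C) and products (D, E, F) of inertia, eqs. (9-20) to (9-23).

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def angular_momentum(m, r, w):
    v = cross(w, r)                        # v = omega x r, eq. (9-15)
    return [m * c for c in cross(r, v)]    # M = r x m v, eq. (9-17)

def inertia_coefficients(m, r):
    x, y, z = r                            # definitions of eq. (9-21)
    return (m * (y * y + z * z), m * (z * z + x * x), m * (x * x + y * y),
            m * y * z, m * z * x, m * x * y)

def twice_kinetic_energy(m, r, w):
    A, B, C, D, E, F = inertia_coefficients(m, r)
    wx, wy, wz = w                         # expansion of eq. (9-23)
    return (A * wx * wx + B * wy * wy + C * wz * wz
            - 2 * D * wy * wz - 2 * E * wz * wx - 2 * F * wx * wy)

m, r, w = 2.0, [1.0, -2.0, 0.5], [0.3, 0.4, -1.0]   # arbitrary sample values
v = cross(w, r)
print(m * dot(v, v), dot(w, angular_momentum(m, r, w)), twice_kinetic_energy(m, r, w))
```

For a body made of many particles the same functions would be summed over the particles, since A through F are additive.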

9.5. The Eulerian Angles. — We digress here to give explicit relations
useful for locating a point P in a rigid body. Six parameters are needed.
Three of them will specify a fixed reference point in the body, which is not
necessarily at the origin of the coordinate system as in the preceding
discussion. Two more parameters are required to define the position of a line
fixed in the body and passing through the fixed point, while the sixth
parameter defines a rotation of the body about this line.

Suppose we attach a rigid framework O'X'Y'Z' to the body and
denote the position of its origin relative to a coordinate system OXYZ
fixed in space by $x_0, y_0, z_0$. We will also suppose that we know the nine
direction cosines of O'X'Y'Z' relative to OXYZ. The point P may
then be located in either coordinate system at will for we have the relations
(see sec. 4.1)

$$x = x_0 + a_{11}x' + a_{12}y' + a_{13}z'$$
$$y = y_0 + a_{21}x' + a_{22}y' + a_{23}z'$$
$$z = z_0 + a_{31}x' + a_{32}y' + a_{33}z'$$

where x, y, z refer to OXYZ and x', y', z' refer to O'X'Y'Z'. Let us choose
$x_0, y_0, z_0$ as three of the parameters required. The nine direction cosines
which remain, and which we know are not linearly independent, may then
be combined in a variety of ways in order to obtain the three additional
independent parameters needed. Some useful combinations are the
Euler-Rodrigues parameters, the Cayley-Klein parameters, 3 and the Eulerian
angles. The latter are suitable for the present purpose and will now be
described.

Unfortunately, the Euler angles have been defined in several different
ways in the scientific literature and great confusion occurs when one
attempts to compare the results of various writers. The one which we
adopt in the following is that favored by the majority of more than fifty
references 4 which have been consulted. A possible advantage of it lies
in the fact that our angles α and β become the polar angles, φ and θ,
respectively, in spherical polar coordinates. Moreover, a rotation about
the OX-axis toward the OY-axis, as is required in the second step of our
procedure, seems to be a natural operation. This step does, however,
introduce additional imaginary factors into the Cayley-Klein parameters
and the representations of the three-dimensional group (see sec. 15.15).
Other complications also result when one compares the wave-functions
of the asymmetric top in quantum mechanics with those of its limiting case,
the symmetric top. These latter objections are removed if the second
rotation is made about the OY-axis, instead of the OX-axis, as is done by
Whittaker (loc. cit.) and by Wigner. 5 Note, however, that Wigner has
used a left-handed coordinate system.

Let us return to the problem of describing the Eulerian angles, which
we show according to our definition in Fig. 1. Perhaps a clearer conception
of the relations involved may be obtained from the cross-section diagrams
of Fig. 2, which give the planes XOY, ZOZ', and X'OY' and which show,
in parentheses, the axes perpendicular to the plane of the page. It will
be seen that the axis OK, called the line of nodes, is the intersection of the
XOY and X'OY' planes. The axis OL is perpendicular to OK in the XOY
plane, and OM is perpendicular to OK in the X'OY' plane. Study of
Fig. 2 will show that OXYZ may be superimposed on OX'Y'Z' by the
following rotations, provided that they are performed in the order given
and always in a counterclockwise direction: (1) rotate about OZ through an
angle α; (2) rotate through β about OK (OK and OX are now identical
because of the first rotation) which will bring OZ into coincidence with OZ';
(3) rotate about OZ' by γ which then brings OX to OX' and OY to OY'.

Fig. 9-2

3 The Cayley-Klein parameters are related to the Pauli spin matrices used in quantum
mechanics, as will be shown in sec. 15.15.

4 It agrees with that chosen by Goldstein (loc. cit.), who has also commented on the
conflicting definitions of the Euler angles. In his notation, our symbols are: α = φ,
β = θ, γ = ψ. It should be noted that our present equations differ from those in the
first edition of this book, since we inadvertently used a left-handed coordinate system.

Relations between OXYZ and OX'Y'Z' may be found most simply by
matrix methods (see Chapter 10). Suppose a vector $\mathbf{x}$ in the space-fixed
system becomes $\mathbf{x}'$ in the body-fixed system; then the matrix which
connects the two vectors is $R(\alpha, \beta, \gamma)$, where

$$\mathbf{x}' = R(\alpha, \beta, \gamma)\,\mathbf{x}$$

and

$$R(\alpha, \beta, \gamma) = R_z(\gamma)\, R_x(\beta)\, R_z(\alpha)$$

The matrices $R_z(\gamma)$ and $R_z(\alpha)$ have the form of a rotation about a z-axis
through an angle φ, with γ or α in place of φ. The remaining matrix is similar
in form but rearranged to represent a rotation about the OX-axis.

When the matrix product is evaluated, the result is that given in
Table 1. It should be interpreted in a manner similar to that of Table 4-1.


TABLE 1

        OX                                  OY                                  OZ

OX'     cos α cos γ − sin α cos β sin γ     sin α cos γ + cos α cos β sin γ     sin β sin γ

OY'     −cos α sin γ − sin α cos β cos γ    −sin α sin γ + cos α cos β cos γ    sin β cos γ

OZ'     sin α sin β                         −cos α sin β                        cos β
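The entries of Table 1 can be reproduced by multiplying the three elementary rotation matrices. The sketch below is illustrative only: it assumes the product order $R_z(\gamma) R_x(\beta) R_z(\alpha)$ implied by the three rotations described above and the sign convention for a transformation of coordinate axes shown in the comments; the function names and sample angles are hypothetical.

```python
import math

def rot_z(phi):
    """Axes rotated counterclockwise through phi about the z-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(phi):
    """Axes rotated counterclockwise through phi about the x-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_matrix(a, b, g):
    """Assumed product order: R = R_z(gamma) R_x(beta) R_z(alpha)."""
    return matmul(rot_z(g), matmul(rot_x(b), rot_z(a)))

def table_1(a, b, g):
    """Direction cosines as listed in Table 1 (rows OX', OY', OZ')."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [[ca * cg - sa * cb * sg, sa * cg + ca * cb * sg, sb * sg],
            [-ca * sg - sa * cb * cg, -sa * sg + ca * cb * cg, sb * cg],
            [sa * sb, -ca * sb, cb]]

a, b, g = 0.4, 1.1, -0.7   # arbitrary sample angles in radians
R, T1 = euler_matrix(a, b, g), table_1(a, b, g)
print(max(abs(R[i][j] - T1[i][j]) for i in range(3) for j in range(3)))
```

Because each factor is orthogonal, the product is orthogonal as well, so the inverse transformation is given by the transposed table.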


In order to obtain the angular velocity in terms of the Euler angles, it
is convenient to use the body-fixed system, OX'Y'Z', with components
$\dot{\alpha}$, $\dot{\beta}$, and $\dot{\gamma}$ along OZ, OK, and OZ', respectively. Since $\dot{\alpha}$ is parallel to the
space-fixed axis, OZ, its components are given by the last column of Table 1.
The components of $\dot{\beta}$, which is parallel to OK, may be found from the first
column of the matrix $R_z(\gamma)$. Finally, since $\dot{\gamma}$ is parallel to OZ', its only
component is $\dot{\gamma}$. Collecting these results, we have

$$\omega_{x'} = \sin\beta \sin\gamma\, \dot{\alpha} + \cos\gamma\, \dot{\beta}$$
$$\omega_{y'} = \sin\beta \cos\gamma\, \dot{\alpha} - \sin\gamma\, \dot{\beta} \tag{9-24}$$
$$\omega_{z'} = \cos\beta\, \dot{\alpha} + \dot{\gamma}$$

for the three components of angular velocity along OX', OY', and OZ'.
In terms of the Eulerian angles, the kinetic energy of a rotating symmetric
top (A = B), which we shall need later, is seen from eq. (23) to be

$$T = \tfrac{1}{2}\left[A\dot{\beta}^2 + A\dot{\alpha}^2 \sin^2\beta + C(\dot{\gamma} + \dot{\alpha}\cos\beta)^2\right] \tag{9-25}$$

provided we choose OX', OY', and OZ' to coincide with the principal axes
of inertia of the top, for then the products of inertia D, E, and F all vanish.
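The step from (9-24) to (9-25) is a short piece of algebra that is easy to confirm numerically. The sketch below is an illustration under stated assumptions: principal axes are used so that D = E = F = 0, and the moments of inertia, angles, and angular rates are arbitrary sample values with hypothetical function names.

```python
import math

def body_angular_velocity(b, g, ad, bd, gd):
    """Components of omega along OX', OY', OZ' from eq. (9-24).

    b, g are the Euler angles beta and gamma; ad, bd, gd are the time
    derivatives of alpha, beta, gamma.
    """
    wx = math.sin(b) * math.sin(g) * ad + math.cos(g) * bd
    wy = math.sin(b) * math.cos(g) * ad - math.sin(g) * bd
    wz = math.cos(b) * ad + gd
    return wx, wy, wz

def symmetric_top_energy(A, C, b, ad, bd, gd):
    """Kinetic energy of the symmetric top, eq. (9-25)."""
    return 0.5 * (A * bd ** 2 + A * ad ** 2 * math.sin(b) ** 2
                  + C * (gd + ad * math.cos(b)) ** 2)

A, C = 3.0, 1.5                                  # sample moments of inertia
b, g, ad, bd, gd = 0.9, 0.3, 0.7, -0.2, 1.4      # sample angles and rates
wx, wy, wz = body_angular_velocity(b, g, ad, bd, gd)
# Eq. (9-23) with A = B and vanishing products of inertia.
print(0.5 * (A * wx ** 2 + A * wy ** 2 + C * wz ** 2),
      symmetric_top_energy(A, C, b, ad, bd, gd))
```

The angle γ drops out of the energy because $\omega_{x'}^2 + \omega_{y'}^2$ does not depend on it when A = B.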
9.6. Absolute and Relative Velocity. — We now return to a more
general consideration of the motion of a rigid body. Suppose a point P in
it is located, relative to OXYZ by the vector $\mathbf{r}_0$ and relative to O'X'Y'Z'
by the vector $\mathbf{r}$. Let the instantaneous position of the origin of O'X'Y'Z'
be measured relative to OXYZ by $\mathbf{r}'$, where the prime here and in the
remainder of this chapter never means differentiation. Then the absolute
position of P is given by

$$\mathbf{r}_0 = \mathbf{r}' + \mathbf{r} \tag{9-26}$$




and its absolute velocity by

$$\mathbf{v}_0 = \mathbf{v}' + \mathbf{v}'' \tag{9-27}$$

where $\mathbf{v}' = d\mathbf{r}'/dt$ measures the velocity of the origin of O'X'Y'Z' relative
to OXYZ and $\mathbf{v}''$ is the velocity of the point in the moving system. Now
suppose that the latter system is rotating with constant angular velocity of
ω radians per second; then the point P has a linear velocity $\boldsymbol{\omega} \times \mathbf{r}$ in
addition to its translational velocity $\mathbf{v}$ relative to O'X'Y'Z'. Its
components are $v_x = dr_x/dt = \dot{r}_x$; $v_y = \dot{r}_y$; $v_z = \dot{r}_z$. Thus,

$$\mathbf{v}_0 = \mathbf{v}' + \boldsymbol{\omega} \times \mathbf{r} + \dot{\mathbf{r}} \tag{9-28}$$


It is important to have a clear understanding of the separate terms in
(28). The absolute velocity of the point is $\mathbf{v}_0$; $\mathbf{v}$ is the apparent velocity
as measured by an observer in the system O'X'Y'Z' who does not know
that his coordinate axes are rotating, while $\boldsymbol{\omega} \times \mathbf{r}$ is the absolute velocity
which the terminus of $\mathbf{r}$ must have in order to maintain its position in the
moving body. The last velocity is often called the velocity of following.
If the point P is rigidly attached to the moving system, $\mathbf{v} = 0$; if the
moving system and the fixed system have coincident origins, $\mathbf{v}' = 0$.

9.7. Motion of a Molecule. — In a molecule, we may consider the
electrons and nuclei as bound together in a rigid framework which moves
through space in translational motion and which rotates around its center
of gravity. Both of these types of motion are included in the equations
already given. One further motion is needed, however, for the nuclei
execute oscillations around an equilibrium position. In order to allow for
this vibrational motion, let $\mathbf{r}_i$ be the instantaneous position vector of the
i-th particle and $\mathbf{a}_i$, $\boldsymbol{\rho}_i$ be the equilibrium and displacement vectors,
respectively, so that

$$\mathbf{r}_i = \mathbf{a}_i + \boldsymbol{\rho}_i \tag{9-29}$$

while

$$\mathbf{r}_{0i} = \mathbf{r}' + \mathbf{r}_i \tag{9-30}$$

is the instantaneous position of the point relative to OXYZ as shown in
Fig. 3.

Then from (28)

$$\mathbf{v}_{0i} = \mathbf{v}' + (\boldsymbol{\omega} \times \mathbf{r}_i) + \mathbf{v}_i \tag{9-31}$$




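The reason for writing the last two terms of (32) as given comes from
eq. (4-18) since

$$\mathbf{v}' \cdot (\boldsymbol{\omega} \times \mathbf{r}) = \mathbf{r} \cdot (\mathbf{v}' \times \boldsymbol{\omega}); \quad \mathbf{v} \cdot (\boldsymbol{\omega} \times \mathbf{r}) = \boldsymbol{\omega} \cdot (\mathbf{r} \times \mathbf{v})$$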



Six further relations are needed to define the rotating coordinate system.
These 6 may conveniently be taken as

$$\sum m_i \mathbf{v}_i = 0 \tag{9-33}$$

$$\sum m_i\, \mathbf{a}_i \times \mathbf{v}_i = 0 \tag{9-34}$$

The first three of these equations locate the origin of O'X'Y'Z' at the
center of gravity of the system, for that point is given by

$$\bar{\mathbf{r}} = \frac{\sum m_i \mathbf{r}_i}{\sum m_i}$$

and if $\bar{\mathbf{r}} = 0$, $\sum m_i \mathbf{r}_i = 0$ and $\sum m_i \mathbf{v}_i = 0$. The second condition, eq. (34),
states that there is no angular momentum relative to O'X'Y'Z', when all
particles occupy their equilibrium positions, i.e., when every $\mathbf{r}_i = \mathbf{a}_i$.

Using (29), (33), and (34), eq. (32) becomes

$$2T = v'^2 \sum m_i + \sum m_i v_i^2 + \sum m_i (\boldsymbol{\omega} \times \mathbf{r}_i) \cdot (\boldsymbol{\omega} \times \mathbf{r}_i) + 2\boldsymbol{\omega} \cdot \sum (m_i \boldsymbol{\rho}_i \times \mathbf{v}_i) = 2(T_t + T_v + T_r + T_{int}) \tag{9-35}$$

Inspection of (35) shows that the kinetic energy is a sum of four terms which
may be interpreted in order as due to the translational motion of the
molecule ($T_t$); the vibration of the nuclei about an equilibrium position ($T_v$);
the rotation of the molecule as a rigid body about its center of gravity ($T_r$);
interaction between vibration and rotation ($T_{int}$).

9.8. The Kinetic Energy of a Molecule. — It is necessary to obtain (35)
in explicit form before further calculations can be made. As shown
previously $T_r$ becomes equal to (23), but it must be remembered that A, B,
..., F are instantaneous moments and products of inertia relative to the
moving axes. They are not constants but functions of the position of the
atoms and they change as the molecule vibrates.

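In discussing the terms $T_v$ and $T_{int}$, it is convenient to use normal
coordinates (see sec. 10.17). Suppose $\boldsymbol{\rho}_i$ has components $\xi_i/\sqrt{m_i}$, $\eta_i/\sqrt{m_i}$,
$\zeta_i/\sqrt{m_i}$ where,

$$\xi_i = \sum_k l_{ik} Q_k; \quad \eta_i = \sum_k m_{ik} Q_k; \quad \zeta_i = \sum_k n_{ik} Q_k \tag{9-36}$$

and $l_{ik}$, $m_{ik}$, $n_{ik}$ are constant coefficients such that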

j ^ 3 “ & ij ) 'llLJft'kiWlkj ~ " 3 $ij> ~ 

k k k 

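Then,

$$\sum_i m_i v_i^2 = \sum_i (\dot{\xi}_i^2 + \dot{\eta}_i^2 + \dot{\zeta}_i^2) = \sum_k \dot{Q}_k^2 \tag{9-37}$$

Moreover,

$$\sum_i m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_x = \sum_i (\eta_i \dot{\zeta}_i - \zeta_i \dot{\eta}_i) = \sum_k X_k \dot{Q}_k$$
$$\sum_i m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_y = \sum_i (\zeta_i \dot{\xi}_i - \xi_i \dot{\zeta}_i) = \sum_k Y_k \dot{Q}_k \tag{9-38}$$
$$\sum_i m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_z = \sum_i (\xi_i \dot{\eta}_i - \eta_i \dot{\xi}_i) = \sum_k Z_k \dot{Q}_k$$

where,

$$X_k = \sum_{i,l} (n_{ik} m_{il} - m_{ik} n_{il}) Q_l$$
$$Y_k = \sum_{i,l} (l_{ik} n_{il} - n_{ik} l_{il}) Q_l \tag{9-39}$$
$$Z_k = \sum_{i,l} (m_{ik} l_{il} - l_{ik} m_{il}) Q_l$$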
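Collecting terms, (35) finally appears as

$$2T_t = v'^2 \sum m_i$$
$$2T_v = \sum_k \dot{Q}_k^2$$
$$2T_r = A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_x\omega_y - 2E\omega_y\omega_z - 2F\omega_z\omega_x \tag{9-40}$$
$$2T_{int} = 2\left(\omega_x \sum_k X_k \dot{Q}_k + \omega_y \sum_k Y_k \dot{Q}_k + \omega_z \sum_k Z_k \dot{Q}_k\right)$$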




9.9. The Hamiltonian Form of the Kinetic Energy. — In order to obtain
the Hamiltonian form of (40), we must change from angular velocities to
angular momenta. From (17) or (22) we see that the components of
total angular momentum are

$$P_x = \frac{\partial T}{\partial \omega_x} = A\omega_x - D\omega_y - F\omega_z + \sum_k X_k \dot{Q}_k$$
$$P_y = \frac{\partial T}{\partial \omega_y} = -D\omega_x + B\omega_y - E\omega_z + \sum_k Y_k \dot{Q}_k \tag{9-41}$$
$$P_z = \frac{\partial T}{\partial \omega_z} = -F\omega_x - E\omega_y + C\omega_z + \sum_k Z_k \dot{Q}_k$$

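Similarly, the momenta conjugate to $Q_k$ are

$$p_k = \frac{\partial T}{\partial \dot{Q}_k} = \dot{Q}_k + X_k\omega_x + Y_k\omega_y + Z_k\omega_z \tag{9-42}$$

Solving this equation for $\dot{Q}_k$ and substituting in (41) gives

$$P_x = A\omega_x - D\omega_y - F\omega_z + \sum_k X_k (p_k - X_k\omega_x - Y_k\omega_y - Z_k\omega_z) \tag{9-43}$$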

with similar expressions for $P_y$ and $P_z$. The following abbreviations may
be used to simplify the final results.

$$A' = A - \sum_k X_k^2; \quad D' = D + \sum_k X_k Y_k$$
$$B' = B - \sum_k Y_k^2; \quad E' = E + \sum_k Y_k Z_k \tag{9-44}$$
$$C' = C - \sum_k Z_k^2; \quad F' = F + \sum_k Z_k X_k$$

In terms of them, we may write

$$P_x = A'\omega_x - D'\omega_y - F'\omega_z + \sum_k X_k p_k$$
$$P_y = -D'\omega_x + B'\omega_y - E'\omega_z + \sum_k Y_k p_k \tag{9-43a}$$
$$P_z = -F'\omega_x - E'\omega_y + C'\omega_z + \sum_k Z_k p_k$$

If we also write

$$p_x = \sum_k X_k p_k; \quad p_y = \sum_k Y_k p_k; \quad p_z = \sum_k Z_k p_k \tag{9-45}$$

(43a) may be further simplified to read

$$P_x = p_x + A'\omega_x - D'\omega_y - F'\omega_z$$
$$P_y = p_y - D'\omega_x + B'\omega_y - E'\omega_z \tag{9-46}$$
$$P_z = p_z - F'\omega_x - E'\omega_y + C'\omega_z$$

The quantities $p_x$, $p_y$, $p_z$ arise from vibration alone as may be seen from

Adding together all the terms of (40) and using (41), (42) and (45),
we obtain

$$2T = 2T_t + (P_x - p_x)\omega_x + (P_y - p_y)\omega_y + (P_z - p_z)\omega_z + \sum_k p_k^2 \tag{9-47}$$

Finally, we find by solving (46) for the ω's that

$$\omega_i = \sum_j \mu_{ij} (P_j - p_j); \quad (i, j = x, y, z) \tag{9-48}$$

With the use of these variables, eq. (47) takes the more elegant form

$$2T = 2T_t + \sum_{i,j} \mu_{ij} (P_i - p_i)(P_j - p_j) + \sum_k p_k^2 \tag{9-49}$$

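Explicitly the μ's are:

$$\mu_{xx} = \frac{B'C' - E'^2}{\Delta}; \quad \mu_{yy} = \frac{A'C' - F'^2}{\Delta}; \quad \mu_{zz} = \frac{A'B' - D'^2}{\Delta}$$

$$\mu_{xy} = \frac{C'D' + E'F'}{\Delta}; \quad \mu_{xz} = \frac{D'E' + B'F'}{\Delta}; \quad \mu_{yz} = \frac{A'E' + D'F'}{\Delta} \tag{9-50}$$

$$\Delta = \begin{vmatrix} A' & -D' & -F' \\ -D' & B' & -E' \\ -F' & -E' & C' \end{vmatrix}$$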
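Since the μ's are obtained by solving the linear equations (46) for the ω's, they should be the elements of the inverse of the matrix whose determinant is Δ. The sketch below checks this numerically; it is an illustration only, and the primed inertia values and function names are made-up.

```python
# Verify that the mu's listed above invert the matrix of primed inertia
# coefficients appearing in eq. (9-46). Sample values are arbitrary.

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def inertia_matrix(Ap, Bp, Cp, Dp, Ep, Fp):
    return [[Ap, -Dp, -Fp], [-Dp, Bp, -Ep], [-Fp, -Ep, Cp]]

def mu_matrix(Ap, Bp, Cp, Dp, Ep, Fp):
    delta = det3(inertia_matrix(Ap, Bp, Cp, Dp, Ep, Fp))
    mxx = (Bp * Cp - Ep ** 2) / delta
    myy = (Ap * Cp - Fp ** 2) / delta
    mzz = (Ap * Bp - Dp ** 2) / delta
    mxy = (Cp * Dp + Ep * Fp) / delta
    mxz = (Dp * Ep + Bp * Fp) / delta
    myz = (Ap * Ep + Dp * Fp) / delta
    return [[mxx, mxy, mxz], [mxy, myy, myz], [mxz, myz, mzz]]

vals = (3.0, 4.0, 5.0, 0.2, 0.3, 0.1)   # hypothetical A', B', C', D', E', F'
I, mu = inertia_matrix(*vals), mu_matrix(*vals)
product = [[sum(I[i][k] * mu[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
for row in product:
    print([round(x, 12) for x in row])   # rows of the unit matrix expected
```

In an actual calculation the primed quantities depend on the normal coordinates, so the μ's would have to be re-evaluated for every nuclear configuration.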



9.10. The Vibrational Energy of a Molecule. 7 — The first term in (47),
the translational energy, is of little interest in physical problems. We
shall have no more to say about it. The only other term of that equation
which can be treated further by classical mechanics is the last one,
corresponding to the vibrational energy of the molecule. We first consider the
potential energy of the system due to the vibration of the particles. It will
be some function of the mutual positions of the nuclei and it is most
convenient to specify these in terms of the mass-adjusted components of a
displacement vector. We formerly took these as $\xi_i$, $\eta_i$, $\zeta_i$
(see eq. 36), 3n in number. Following convention we now use $q_1, q_2, \ldots,
q_{3n}$ for the same coordinates. If the system is placed originally in the
equilibrium configuration (all $q_i = 0$) and if the particles have very small
initial velocities, we assume that they will never depart to any large
distance from that configuration, nor will they ever acquire large velocities.
Under these conditions, we may develop the potential energy V by Taylor's
theorem in terms of ascending powers of the $q_i$.

$$V(q_1, q_2, \ldots, q_{3n}) = V_0 + \sum_i \left(\frac{\partial V}{\partial q_i}\right)_0 q_i + \frac{1}{2} \sum_{i,j} \left(\frac{\partial^2 V}{\partial q_i \partial q_j}\right)_0 q_i q_j + \cdots \tag{9-51}$$

The constant term $V_0$ which is independent of the $q_i$ can be omitted since
it has no effect on the equations of motion of the system. The term linear
in the $q_i$ must also vanish since $\partial V/\partial q_i = 0$ is the condition for equilibrium.
Finally if we omit all terms beyond the third, we obtain as an
approximation to the vibrational potential energy

$$2V = \sum_{i,j} b_{ij} q_i q_j \tag{9-52}$$

where $b_{ij} = (\partial^2 V/\partial q_i \partial q_j)_0$. From (37) we have, in terms of the
coordinates $q_i$

$$2T = \sum_i \dot{q}_i^2 \tag{9-53}$$

where T is now written for the former $T_v$.

7 This section as well as secs. 9.11 and 9.12 makes use of some of the results of
Chapter 10. It should be omitted or postponed by readers not familiar with orthogonal
transformations. The authors suggest that the reader, rather than endeavor to under-

If we now subject both T and V to an orthogonal transformation (see
sec. 10.17), we obtain

$$2T = \sum_k \dot{Q}_k^2; \quad 2V = \sum_k \lambda_k Q_k^2 \tag{9-54}$$

where the normal coordinates $Q_k$ are related to the q's by

$$q_i = \sum_k a_{ik} Q_k \tag{9-55}$$

The constants $\lambda_k$ are the 3n eigenvalues found from the characteristic
equation

$$|\lambda \delta_{ij} - b_{ij}| = 0 \tag{9-56}$$

and $a_{ik}$ is the matrix formed from the eigenvectors.

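Knowing T and V we may obtain the motion of the molecule by solving
Lagrange's equations (8). They appear as

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{Q}_k}\right) - \frac{\partial L}{\partial Q_k} = 0$$

or

$$\ddot{Q}_k = -\lambda_k Q_k \tag{9-57}$$

Three different possibilities arise: (a) $\lambda_k > 0$; (b) $\lambda_k = 0$; (c) $\lambda_k < 0$.

a. $\lambda_k > 0$. The solution of (57) is

$$Q_k = A_k \cos(\sqrt{\lambda_k}\, t + \delta_k)$$

This is the equation of simple harmonic motion with two constants of
integration; $A_k$ is the amplitude and $\delta_k$, the phase constant. Eq. (55) now
reads

$$q_i = \sum_k a_{ik} A_k \cos(\sqrt{\lambda_k}\, t + \delta_k) \tag{9-59}$$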

If all of the $A_k$ are zero except one, say $A_1$, then all of the nuclei are acting
as simple harmonic oscillators with a frequency of

$$\nu_1 = \frac{\sqrt{\lambda_1}}{2\pi}$$

about their equilibrium position. Each nucleus has the same phase
constant and reaches its equilibrium position at the same time. The
amplitudes will vary because of the factor $a_{i1}$. Such a motion is called a normal
mode of vibration. Actually the situation is much more complex, for many
of the $A_k$ will be different from zero. Thus the motion of the nuclei
consists of a superposition of all the normal modes of vibration, each with its
own frequency $\sqrt{\lambda_k}/2\pi$ and amplitude.

It frequently happens that some of the $\lambda_k$ will be equal to each other
in pairs or threes. This phenomenon, called double or triple degeneracy, 8
means that two or three equivalent motions of the molecule have the same
frequency and differ only with respect to their orientation in space. The
phase factors and amplitudes must be evaluated from the initial positions
and velocities of the n nuclei. We show in the next section how the normal
modes and coordinates may be determined for a specific example.

b. $\lambda_k = 0$. The solution of (57) is

$$Q_k = A_k t + \delta_k$$

hence the resulting motion is not a vibration. The nuclei will not oscillate
about the equilibrium position but will continually move away. Since the
whole treatment of the problem is based upon small oscillations from the
equilibrium position, we are no longer justified in this case in omitting
higher terms in the potential energy, and the method fails. Actually, it
will be found that six of the $\lambda_k$ vanish in the molecular problem (five if the
equilibrium arrangement of the nuclei is linear). Three of these zero
frequencies may be associated with translation of the molecule along three
mutually perpendicular axes and the remaining three with rotation about
the same axes. When it is desired, the zero frequencies may be removed
from the problem before solving (56). This is done by reducing the
number of coordinates from 3n to 3n - 6, the equations of conservation
of linear and angular momentum (eqs. 33 and 34) being used for that
purpose.

8 Wu, Ta-You, "Vibrational Spectra and Structure of Polyatomic Molecules,"
Second Edition, Edwards Brothers, Inc., Ann Arbor, 1946; Mathieu, Jean-Paul,
"Spectres de Vibration et Symétrie des Molécules et des Cristaux,"

c. $\lambda_k < 0$. The solution becomes imaginary and again does not
correspond to a vibration. This case never occurs if the potential energy is a
positive definite quadratic form (see sec. 10.12) which is always true in the
molecular problem.

9.11. Vibrations of a Linear Triatomic Molecule. — As an example of
the preceding theory, we consider a linear symmetrical triatomic molecule
XY$_2$ such as carbon dioxide. Let the central atom X have a mass $m_2$
and the two end particles have mass $m_1$. Let the equilibrium positions
be $x_1^0$ and $x_3^0$ for Y and $x_2^0$ for X. In order to simplify the problem, we
arbitrarily assume that the only motion which the nuclei can make is along the
line joining them, hence the displaced positions are $x_i = x_i^0 + \delta x_i$. If
we now take the potential energy 9 as proportional to the square of the
relative displacements of the particles, in accordance with eq. (51) we have

$$2V = k\{(\delta x_1 - \delta x_2)^2 + (\delta x_2 - \delta x_3)^2\} \tag{9-60}$$

and

$$2T = m_1(\delta\dot{x}_1^2 + \delta\dot{x}_3^2) + m_2\, \delta\dot{x}_2^2$$

In terms of mass adjusted coordinates $q_i = \sqrt{m_i}\, \delta x_i$ (the mass of both end
atoms being $m_1$)

$$2T = \dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2; \quad 2V = k\left\{\left(\frac{q_1}{\sqrt{m_1}} - \frac{q_2}{\sqrt{m_2}}\right)^2 + \left(\frac{q_2}{\sqrt{m_2}} - \frac{q_3}{\sqrt{m_1}}\right)^2\right\}$$

Comparison with (52) shows us that

$$b_{11} = k/m_1; \quad b_{12} = b_{21} = -k/\sqrt{m_1 m_2}; \quad b_{13} = b_{31} = 0$$
$$b_{22} = 2k/m_2; \quad b_{23} = b_{32} = -k/\sqrt{m_1 m_2}; \quad b_{33} = k/m_1$$
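The coefficients $b_{ij}$ are the second derivatives of V with respect to the mass-adjusted coordinates, so they can also be obtained by differencing the potential numerically. The sketch below is illustrative only: it assumes the potential (9-60) written in the coordinates $q_i$, uses arbitrary values for k, $m_1$, and $m_2$, and the function names are hypothetical.

```python
import math

def potential(q, k, m1, m2):
    """V of eq. (9-60) expressed in mass-adjusted coordinates q_i."""
    dx1 = q[0] / math.sqrt(m1)
    dx2 = q[1] / math.sqrt(m2)
    dx3 = q[2] / math.sqrt(m1)   # the third atom also has mass m1
    return 0.5 * k * ((dx1 - dx2) ** 2 + (dx2 - dx3) ** 2)

def hessian(f, q0, h=1e-3):
    """Central-difference estimate of the second derivatives of f at q0."""
    n = len(q0)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                q = list(q0)
                q[i] += si * h
                q[j] += sj * h
                return f(q)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

def b_matrix(k, m1, m2):
    """Closed-form coefficients b_ij quoted in the text."""
    c = -k / math.sqrt(m1 * m2)
    return [[k / m1, c, 0.0], [c, 2 * k / m2, c], [0.0, c, k / m1]]

k, m1, m2 = 3.0, 16.0, 12.0      # arbitrary sample values
numeric = hessian(lambda q: potential(q, k, m1, m2), [0.0, 0.0, 0.0])
exact = b_matrix(k, m1, m2)
print(max(abs(numeric[i][j] - exact[i][j]) for i in range(3) for j in range(3)))
```

For a purely quadratic potential the central differences are exact apart from rounding, which is why a fairly coarse step can be used here.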

When these values are substituted in (56) and the determinantal equation
is solved we obtain

$$\lambda_1 = k/m_1; \quad \lambda_2 = k\mu; \quad \lambda_3 = 0$$
$$\mu = (2m_1 + m_2)/m_1 m_2 \tag{9-62}$$

In order to find the coefficients of eq. (55) which relate the $q_i$ to the
normal coordinates $Q_k$ it is necessary to find the transformation which
reduces T and V simultaneously to a sum of squares (see sec. 10.17).
According to sec. 10.15 the matrix effecting this transformation has as its
columns the eigenvectors of the matrix $[b_{ij}]$, and these eigenvectors are the




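solutions $(x_1, x_2, x_3)$ of the equations

$$\sum_j b_{ij} x_j = \lambda x_i$$

corresponding to the three eigenvalues $\lambda_k$ already found. Simple
computation yields for these eigenvectors

$$[-x_3,\ 0,\ x_3], \quad \lambda = \frac{k}{m_1}$$
$$\left[x_3,\ -2\sqrt{m_1/m_2}\; x_3,\ x_3\right], \quad \lambda = k\mu$$
$$\left[x_3,\ \sqrt{m_2/m_1}\; x_3,\ x_3\right], \quad \lambda = 0$$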


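They are already orthogonal; when $x_3$ is also fixed by normalization, i.e.,
by equating the sum of the squares of the components of each vector to
unity, they may be compounded to give

$$a_{ik} = \begin{bmatrix} -1/\sqrt{2} & 1/\sqrt{2\mu m_1} & 1/\sqrt{\mu m_2} \\ 0 & -2/\sqrt{2\mu m_2} & 1/\sqrt{\mu m_1} \\ 1/\sqrt{2} & 1/\sqrt{2\mu m_1} & 1/\sqrt{\mu m_2} \end{bmatrix}$$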
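The eigenvalues (9-62) and the matrix $a_{ik}$ can be checked directly against the matrix $[b_{ij}]$. The following sketch is an illustration, not part of the original derivation: the values of k, $m_1$, and $m_2$ are arbitrary and the function names are hypothetical.

```python
import math

def force_matrix(k, m1, m2):
    """Coefficients b_ij of the linear symmetric triatomic molecule."""
    c = -k / math.sqrt(m1 * m2)
    return [[k / m1, c, 0.0], [c, 2 * k / m2, c], [0.0, c, k / m1]]

def eigenvalues(k, m1, m2):
    mu = (2 * m1 + m2) / (m1 * m2)           # eq. (9-62)
    return [k / m1, k * mu, 0.0]

def mode_matrix(m1, m2):
    """Matrix a_ik whose columns are the normalized eigenvectors."""
    mu = (2 * m1 + m2) / (m1 * m2)
    s = 1 / math.sqrt(2)
    u = 1 / math.sqrt(2 * mu * m1)
    w = -2 / math.sqrt(2 * mu * m2)
    t1 = 1 / math.sqrt(mu * m1)
    t2 = 1 / math.sqrt(mu * m2)
    return [[-s, u, t2], [0.0, w, t1], [s, u, t2]]

def residual(k, m1, m2):
    """Largest entry of b a - a diag(lambda); zero if the columns are eigenvectors."""
    b, a, lam = force_matrix(k, m1, m2), mode_matrix(m1, m2), eigenvalues(k, m1, m2)
    return max(abs(sum(b[i][j] * a[j][n] for j in range(3)) - lam[n] * a[i][n])
               for i in range(3) for n in range(3))

print(residual(1.0, 16.0, 12.0))   # sample masses in arbitrary units
```

Orthonormality of the columns is what makes both T and V reduce to sums of squares under the substitution (55).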

We can now find the normal modes of vibration from (59). When
$\lambda_k = \lambda_1$, we see that the two end atoms move in opposite directions while
the central atom is stationary. The other normal modes are found in the
same way. They are shown in Fig. 4. It will be observed that for the
zero frequency 10 $\lambda_3$, the motion is translational, since $\delta x_1 = m_1^{-1/2} q_1 =
m_1^{-1/2} a_{13}(A_3 t + \delta_3) = (2m_1 + m_2)^{-1/2}(A_3 t + \delta_3)$, and $\delta x_2$, $\delta x_3$ are also
given by this expression.

Fig. 9-4



The treatment of this molecule is not complete because of the artificial 
assumption that the motion is only along the line of nuclei. For a com- 
plete treatment the reader is referred to Wu, Mathieu, or Herzberg 
(loc. cit. ). 

9.12. Quantum Mechanical Hamiltonian. — Lack of space forbids the
transcription of the results thus far obtained into the quantum mechanical
language of Chapter 11. To provide a general view, however, we shall
append here a few comments indicating the line of attack to be taken on the
problem of the polyatomic molecule from the quantum point of view. The
material of this section is not needed in other parts of this book. The
expression for the classical kinetic energy found in (49) contains momenta
$p_a$ ($a = x, y, z$) defined in eq. (45) and $p_k$ defined in eq. (42). Both of
these are conjugate to the normal coordinates $Q_k$. On the other hand the
momenta $P_a$ of (43) are not conjugate to $Q_k$. In order to obtain a suitable
expression for use in quantum mechanical calculations, all of the
coordinates and momenta must be conjugate to each other. It is true that the
$P_a$ which are functions of the angular velocities could be written in terms
of some set of coordinates such as the Eulerian angles and then the Eulerian
angles α, β, γ, the normal coordinates and the conjugate momenta $p_\alpha$, $p_\beta$,
$p_\gamma$, and $p_k$ would be appropriate. The coordinates used in (49) may be
retained, however, as shown by several authors. The correct quantum
mechanical Hamiltonian 11 is

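$$H = \tfrac{1}{2}\mu^{1/2} \sum_{a,b} (P_a - p_a)\, \mu_{ab}\, \mu^{-1/2} (P_b - p_b) + \tfrac{1}{2}\mu^{1/2} \sum_k p_k\, \mu^{-1/2}\, p_k + V \tag{9-64}$$

where a, b denote x, y, z and μ is the determinant of $\mu_{ab}$ (cf. eq. 50). This
expression may be simplified by noting that $P_a$ commutes with $p_k$ and that
the $\mu_{ab}$ are functions only of the $Q_k$. We thus obtain

$$H = \tfrac{1}{2} \sum_{a,b} \mu_{ab} P_a P_b - \sum_a h_a P_a + \tfrac{1}{2}\mu^{1/2} \sum_{a,b} p_a\, \mu_{ab}\, \mu^{-1/2}\, p_b + \tfrac{1}{2}\mu^{1/2} \sum_k p_k\, \mu^{-1/2}\, p_k + V \tag{9-65}$$

where

$$h_a = \tfrac{1}{2} \sum_b \left[\mu_{ab}\, p_b + p_b\, \mu_{ab} + \mu_{ab}\, \mu^{1/2} (p_b\, \mu^{-1/2})\right] \tag{9-66}$$

and $p_b$ does not commute with the μ's.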

For the sake of greater generality, we no longer need confine ourselves
to the potential energy expression previously used but write the most
general function consistent with the symmetry of the molecule

$$V = V_0 + V_1 + V_2 + \cdots$$




The first term, $V_0$, is identical with that given in (52) or (54); $V_1$ is
homogeneous in the third powers of the normal coordinates and their
products; $V_2$ is of the fourth power, etc. When the Hamiltonian is
expanded, it is found that it can be divided into terms of different order
as follows:

$$H = H_0 + H_1 + H_2 + \cdots$$

The explicit form of $H_0$ is

$$2H_0 = \left\{\frac{P_x^2}{A_0} + \frac{P_y^2}{B_0} + \frac{P_z^2}{C_0}\right\} + \sum_k p_k^2 + 2V_0$$

where $A_0$, $B_0$, $C_0$ are the equilibrium moments of inertia. Evidently
this represents the sum of the Hamiltonians of a rigid rotator and a
harmonic oscillator; hence this part of H may be treated by the
methods of quantum mechanics as outlined in Chapter 11.

Even to this order of approximation the details are complicated, for the
vibrational part of the Hamiltonian for an n-atomic molecule requires the
solution of a secular determinant like (56) with (3n - 6) rows and columns.
Utilization of molecular symmetry, however, makes it possible to factor
this determinant. 12 Moreover, if the potential energy is
expressed in coordinates parallel or perpendicular to chemical bonds
(valence-bond coordinates), then vector and matrix methods, developed by
Wilson and others, 13 prove to be powerful tools for even complicated
molecules.

Still further difficulties arise if higher terms in the Hamiltonian are
included but these are important if interactions between the rotational and
vibrational energies are considered. Such interactions are observed
experimentally and higher order rotational effects are important,
especially in the microwave region. 14 A suitable perturbation method
for such cases, which involves contact transformations, has been developed
by Nielsen 15 for the general n-atomic molecule and has been applied to
many special molecules.

12 See Chapter 15, or Herzberg, loc. cit.

13 Wilson, E. B., Jr., Decius, J. C., and Cross, P. C., "Molecular Vibrations: The
Theory of Infrared and Raman Vibrational Spectra," McGraw-Hill Book Co., Inc.,
New York, 1955.

14 Gordy, W., Smith, W. V., and Trambarulo, R. F., "Microwave Spectroscopy,"


CHAPTER 10 


MATRICES AND MATRIX ALGEBRA 

In ordinary arithmetic, attention is focused upon single numbers.
These numbers may be combined by various operations, such as addition,
subtraction, multiplication and so on, to yield new numbers. In many
branches of algebra, the student is forced to confer interest, not upon
single numbers, but on collections of numbers (or functions). These
collections can be simple sequences like $a_1, a_2, \ldots, a_n$, in which the order of
the individuals may, or may not be of importance. A vector is an example
of this kind. When such a sequence is written down, no understanding
prevails that the numbers are to be combined in a certain way; it is the
collection itself which matters. Meaning is imparted to the collection by
specifying how it is to be combined with other collections.

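Besides simple sequences, collections of two-dimensional character are
often objects of interest in mathematics, and recently in physics and
chemistry. They may have a great variety of forms; they may be
triangular, as

$$\begin{matrix} a_1 & & \\ b_1 & b_2 & \\ c_1 & c_2 & c_3 \end{matrix}$$

or rectangular, as

$$\begin{matrix} a_1 & a_2 & a_3 & \cdots & a_n \\ b_1 & b_2 & b_3 & \cdots & b_n \\ \vdots & & & & \vdots \\ e_1 & e_2 & e_3 & \cdots & e_n \end{matrix}$$

or quadratic, as

$$\begin{matrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{matrix}$$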

Of these, the rectangular and quadratic ones are of greatest value.

... determinants and matrices. This is usually indicated by enclosing the array
in bars or brackets of different form, bars being frequently used for
determinants, brackets for matrices. It is also convenient to use the same letter
for the individuals of a collection, and to distinguish the elements of a
simple or linear collection by single subscripts, those of a two-dimensional
collection by two subscripts.

10.1. Arrays. — A collection of real or complex quantities is an
array if it can be displayed in an orderly table of rows and columns. The
individual members of the array are its elements. Each is labeled by a
pair of indices, the first one referring to the row and the second to the
column in which the element is located. For example, the element $A_{pq}$
will appear in the p-th row and the q-th column. If the number of rows
equals the number of columns, the array is said to be square
and of order n; if there are n rows and m columns ($n \ne m$), the array is
rectangular and of order (n × m).

10.2. Determinants. — The most familiar type of array is the
determinant, 1 which always has an equal number of rows and columns. It is
written in one of the forms:

$$\det A = |A| = \begin{vmatrix} A_{11} & A_{12} & A_{13} & \cdots & A_{1n} \\ A_{21} & A_{22} & A_{23} & \cdots & A_{2n} \\ A_{31} & A_{32} & A_{33} & \cdots & A_{3n} \\ \vdots & & & & \vdots \\ A_{n1} & A_{n2} & A_{n3} & \cdots & A_{nn} \end{vmatrix}$$

The value of the determinant is obtained by the following rule.
First, a total of n! products is formed by taking one element from each row
and column. Each product is then arranged so that the first subscripts
of the elements are in their natural order 1, 2, ..., n. When this has
been done, it will be found that the products may be separated into even
and odd classes each containing n!/2 terms, as follows. In the even class,
an even number of interchanges of the elements is required to bring the
second subscripts into their natural order while in the odd class an odd
number of interchanges is needed. For example, $A_{12}A_{23}A_{31}$ is in the even
class while $A_{12}A_{21}A_{33}$ is in the odd class. If a plus sign is given to the
even products and a minus sign to the odd ones, the algebraic sum of the
n! terms, by definition, is the value of the determinant. Thus we may
write

$$|A| = \sum (-1)^h A_{1r_1} A_{2r_2} \cdots A_{nr_n} \tag{10-1}$$


1 References will be found at the end of this chapter. The most




where the summation is made over all permutations of $r_1, r_2, \cdots, r_n$, and h
is the number of interchanges required to restore the natural order.
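
The definition of eq. (1) can be tried out numerically. The following is a minimal Python sketch, not part of the original text: the function names and the sample array are illustrative assumptions, and the parity h is obtained by counting inversions, which has the same parity as the number of interchanges needed to restore the natural order.

```python
from itertools import permutations


def permutation_sign(perm):
    """Return +1 or -1 according to the parity of the inversion count of perm."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det_by_permutations(a):
    """Evaluate |A| as the signed sum over all n! permutations of the second subscripts."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for row, col in enumerate(perm):
            term *= a[row][col]  # one element from each row and each column
        total += term
    return total


if __name__ == "__main__":
    example = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]  # illustrative values only
    print(det_by_permutations(example))
```

With 0-based indices, the tuple (1, 2, 0) corresponds to the even product $A_{12}A_{23}A_{31}$ and (1, 0, 2) to the odd product $A_{12}A_{21}A_{33}$. Since the sum has n! terms, the sketch is only practical for small orders.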

The following properties are direct consequences² of this definition. In
each statement, the word row may be replaced by the word column and
the reverse.

1. The value of a determinant vanishes, |A| = 0, when:

a. All elements of a row are zero.

b. All elements of one row are identical with, or multiples of, the
corresponding elements of another row.

2. The value of a determinant is unchanged, if:

a. Rows and columns are interchanged.

b. A linear combination of any number of rows is added to any one
row; i.e., if $A_{ij}$ is replaced by $\sum_{k=1}^{n} c_k A_{kj}$, j = 1, 2, ···, n, provided the $c_k$ are
fixed numbers with $c_i = 1$, so that row i itself enters with unit weight.

3. The value of a determinant changes sign if two rows are interchanged.

4. If each element in any one row appears as the sum (or difference) of
two or more quantities, the determinant may be written as a sum (or
difference) of two or more determinants of the same order. Thus if the order
is two

$$
\begin{vmatrix} A_{11} \pm B_{11} & A_{12} \pm B_{12} \\ A_{21} & A_{22} \end{vmatrix}
=
\begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix}
\pm
\begin{vmatrix} B_{11} & B_{12} \\ A_{21} & A_{22} \end{vmatrix}
$$

5. If all elements of a row are multiplied by a constant factor, the value
of the determinant is multiplied by the same factor.

10.3. Minors and Cofactors. The complementary minor of an element
$A_{pq}$ is the determinant obtained by striking out the row and column
in which $A_{pq}$ appears. The cofactor of $A_{pq}$ is $(-1)^{p+q}$ times its
complementary minor. It will be indicated by $A^{pq}$. It follows from eq. (1) that

$$|A| = \sum_{i=1}^{n} A_{ik}A^{ik} = \sum_{i=1}^{n} A_{ki}A^{ki}; \quad (k = 1, 2, \cdots, n) \tag{10-2}$$

However,

$$\sum_{i=1}^{n} A_{ik}A^{ij} = \sum_{i=1}^{n} A_{ki}A^{ji} = 0; \quad j \ne k \tag{10-3}$$

for comparison with (2) shows that these equations are the expansion of a
determinant whose k-th and j-th columns are identical with the k-th column
of |A|, and according to property 1-b of sec. 10.2, if two columns are
identical, |A| = 0. Eq. (2), the Laplace development, is commonly




used for numerical evaluation of determinants, but if their order is larger 
than three or four, the number of terms and the labor involved is so great 
that other procedures are to be preferred. We describe one in sec. 13.27. 
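
Equations (2) and (3) are easy to check on a small array. The sketch below is an illustration under stated assumptions rather than the authors' procedure: the names `minor`, `cofactor`, `det` and `column_expansion` are hypothetical, indices are 0-based, and the determinant is evaluated recursively by development along the first row.

```python
def minor(a, p, q):
    """Array left after striking out row p and column q (0-based indices)."""
    return [row[:q] + row[q + 1:] for i, row in enumerate(a) if i != p]


def det(a):
    """Determinant by Laplace development along the first row."""
    if not a:
        return 1  # empty determinant, needed for the cofactor of a 1 x 1 array
    return sum(a[0][k] * cofactor(a, 0, k) for k in range(len(a)))


def cofactor(a, p, q):
    """(-1)^(p+q) times the complementary minor of element A[p][q]."""
    return (-1) ** (p + q) * det(minor(a, p, q))


def column_expansion(a, k, j):
    """Sum over i of A[i][k] * cofactor(i, j): |A| if j == k, zero otherwise."""
    return sum(a[i][k] * cofactor(a, i, j) for i in range(len(a)))


if __name__ == "__main__":
    a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]  # illustrative values only
    print([[column_expansion(a, k, j) for j in range(3)] for k in range(3)])
```

The printed table has |A| on its diagonal and zeros elsewhere, which is the content of (2) and (3) taken together. The recursion makes the rapid growth in labor with increasing order directly visible.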

10.4. Multiplication and Differentiation of Determinants. If |A|
and |B| are determinants of order n, the product |C|

$$|A||B| = |C|$$

is a determinant of the same order. Its elements are given by one of the
four equivalent (though not equal!) expressions

$$C_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj} \ \text{ or } \ \sum_{k=1}^{n} A_{ik}B_{jk} \ \text{ or } \ \sum_{k=1}^{n} A_{ki}B_{kj} \ \text{ or } \ \sum_{k=1}^{n} A_{ki}B_{jk} \tag{10-4}$$

The proof for determinants of order two follows. Using the first form
of (4) we obtain

$$
|C| =
\begin{vmatrix}
A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\
A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22}
\end{vmatrix}
$$

but according to property 4 of sec. 10.2, the product may also be written

$$
|C| =
\begin{vmatrix} A_{11}B_{11} & A_{11}B_{12} \\ A_{21}B_{11} & A_{21}B_{12} \end{vmatrix}
+
\begin{vmatrix} A_{11}B_{11} & A_{11}B_{12} \\ A_{22}B_{21} & A_{22}B_{22} \end{vmatrix}
+
\begin{vmatrix} A_{12}B_{21} & A_{12}B_{22} \\ A_{21}B_{11} & A_{21}B_{12} \end{vmatrix}
+
\begin{vmatrix} A_{12}B_{21} & A_{12}B_{22} \\ A_{22}B_{21} & A_{22}B_{22} \end{vmatrix}
$$

The first and last terms of this sum vanish, for if the constant factor $A_{11}A_{21}$
is removed from the first determinant its first row is identical with its
second row. Removal of the constant term $A_{12}A_{22}$ from the last determinant
leaves it with two identical rows. Constant factors may also be
removed from the remaining determinants but they do not vanish. The
result is

$$
|C| = A_{11}A_{22}
\begin{vmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{vmatrix}
+ A_{12}A_{21}
\begin{vmatrix} B_{21} & B_{22} \\ B_{11} & B_{12} \end{vmatrix}
$$

Referring to property 3 of sec. 10.2 we see that this becomes

$$
|C| = (A_{11}A_{22} - A_{12}A_{21})
\begin{vmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{vmatrix}
$$

Finally we note that $(A_{11}A_{22} - A_{12}A_{21})$ is just the Laplace development
of |A| so that we have shown the equivalence of the determinant |C|
with the product |A||B|. The proof with the other forms of (4) is





1 (25), the partial derivative of a determinant with respect to an
element equals the cofactor:

$$\frac{\partial |A|}{\partial A_{ik}} = A^{ik}$$

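10.5. Preliminary Remarks on Matrices. If two or more arrays may
be combined in a certain way described in sec. 10.6, they are called
matrices. We indicate them by

$$
A = [A_{ij}] =
\begin{bmatrix}
A_{11} & A_{12} & A_{13} & \cdots & A_{1m} \\
A_{21} & A_{22} & A_{23} & \cdots & A_{2m} \\
A_{31} & A_{32} & A_{33} & \cdots & A_{3m} \\
\vdots & \vdots & \vdots &        & \vdots \\
A_{n1} & A_{n2} & A_{n3} & \cdots & A_{nm}
\end{bmatrix}
$$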


Unlike determinants, matrices may be square or rectangular. Matrices
:»e order 4 will not be discussed here. When a matrix contains only 
one row or column, it is called a vector. For a row vector, we will write

$$[x] = [x_1, x_2, x_3, \cdots, x_n] \tag{10-5a}$$

To save space, we write a column vector as

$$\{x\} = \{x_1, x_2, x_3, \cdots, x_n\} \tag{10-5b}$$

although its matrix form would be

$$
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix}
$$

A letter u, v, ···, written without brace or bracket always means a
column vector. Matrices with two or more rows or columns will be
indicated by capital letters.

The elements of a square matrix A may be written and evaluated as a
determinant. If |A| = 0, the corresponding matrix A is called singular.
Since determinants do not exist for rectangular (non-quadratic) arrays, all
rectangular matrices, by definition, are singular. Suppose we formed
determinants of all possible orders by taking successively 1, 2, ···, n rows
and columns of A. If at least one determinant of order r does not vanish
and all determinants of order greater than r do vanish, A is said to be of
rank r. Thus if A is singular and of order n, r < n; if non-singular,
r = n.

Problem. Show that the rank of the following matrix is two:

$$
\begin{bmatrix}
1 & 1 & 1 & 1 \\
2 & 2 & 3 & -1 \\
0 & 0 & 1 & -3 \\
3 & 3 & 5 & -3
\end{bmatrix}
$$
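
The rank of the matrix in this problem can be checked numerically. The sketch below is an assumption-laden illustration, not the method defined in the text: instead of examining all subdeterminants it uses row elimination with exact fractions, which yields the same rank; the function name `rank` is hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


if __name__ == "__main__":
    problem = [[1, 1, 1, 1], [2, 2, 3, -1], [0, 0, 1, -3], [3, 3, 5, -3]]
    print(rank(problem))
```

Row elimination only adds multiples of one row to another and interchanges rows, operations which by the properties of sec. 10.2 cannot turn a vanishing determinant into a non-vanishing one, so the count of pivots agrees with the rank as defined above.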

10.6. Combination of Matrices. Two matrices A and B are equal if
and only if they are identical. If A = B, then $A_{pq} = B_{pq}$ for every p
and q.

The addition or subtraction of two matrices of order n gives a new
matrix of the same order according to the following rule. If A ± B = C,
then $C_{pq} = A_{pq} \pm B_{pq}$. Addition and subtraction are both commutative
and associative.

$$A \pm B = \pm B + A; \quad (A \pm B) \pm C = A + (\pm B \pm C)$$

Multiplication of a matrix by a scalar quantity a is defined by

$$aA = a[A_{ij}] = [aA_{ij}] = Aa$$

Two matrices A and B may be multiplied together in the order AB
only when the number of columns in A equals the number of rows in B.
Under this condition, the matrices are said to be conformable. If A is of
order (n × h), B of order (h × m), the product C is of order (n × m).
Its elements are given by

$$C_{pq} = \sum_{s=1}^{h} A_{ps}B_{sq}; \quad (p = 1, 2, \cdots, n; \ q = 1, 2, \cdots, m)$$

$$AB = [C_{pq}] = C \tag{10-6}$$

This rule for multiplying matrices is not as arbitrary as it might seem; it is
suggested by the properties of linear transformations and the reason for
defining it in this way will be given in sec. 10.10. We note at this point,
however, that the law of matrix multiplication is identical with the first
form of eq. (4) which defines the multiplication of determinants. Hence
det (AB) = (det A) · (det B) if A, B are square, but in general det (A + B) ≠ det A +
det B. In general, AB ≠ BA, but when the order of multiplication is of no
importance, so that AB = BA, the two matrices are said to commute or to
be permutable. The ordinary laws regarding distribution and association 




Provided A, B, x and y are properly conformable

$$A\{x\} = \{y\}; \quad [x]A = [y]$$

$$[x]\{y\} = \text{a scalar}; \quad \{x\}[y] = B \tag{10-7}$$

In the last case, B is a square matrix which has the same number of rows
as {x} (or columns as [y]). Its rows (or columns) are proportional to each
other.
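
The multiplication rule (6), the relation det (AB) = (det A)(det B), and the products of eq. (7) may be illustrated as follows. This is a minimal sketch with hypothetical helper names (`matmul`, `det`, `outer`) and arbitrary sample values; matrices are represented as lists of rows.

```python
def matmul(a, b):
    """Product AB with C[p][q] = sum over s of A[p][s] * B[s][q]."""
    if len(a[0]) != len(b):
        raise ValueError("matrices are not conformable in this order")
    return [
        [sum(a[p][s] * b[s][q] for s in range(len(b))) for q in range(len(b[0]))]
        for p in range(len(a))
    ]


def det(a):
    """Determinant by expansion along the first row (fine for small orders)."""
    if len(a) == 1:
        return a[0][0]
    return sum(
        (-1) ** k * a[0][k] * det([row[:k] + row[k + 1:] for row in a[1:]])
        for k in range(len(a))
    )


def outer(col, row):
    """The product {x}[y]: a column times a row gives a full matrix."""
    return matmul([[x] for x in col], [row])


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]  # illustrative values only
    b = [[0, 1], [1, 1]]
    print(matmul(a, b), matmul(b, a))          # AB and BA differ
    print(det(matmul(a, b)), det(a) * det(b))  # the two determinants agree
    print(matmul([[1, 2]], [[3], [4]]))        # [x]{y}: a 1 x 1 result
    print(outer([1, 2], [3, 4]))               # {x}[y]: proportional rows
```

Because the rows of {x}[y] are proportional to each other, its determinant vanishes by property 1-b of sec. 10.2.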

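A given matrix may be divided into smaller matrices, the result being a
partitioned matrix. For example, a square matrix of order three may be
divided into four submatrices as shown.

$$
A =
\left[\begin{array}{cc|c}
A_{11} & A_{12} & A_{13} \\
A_{21} & A_{22} & A_{23} \\ \hline
A_{31} & A_{32} & A_{33}
\end{array}\right]
=
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
$$

where

$$
a_{11} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}; \quad
a_{12} = \begin{bmatrix} A_{13} \\ A_{23} \end{bmatrix}; \quad
a_{21} = [A_{31} \ \ A_{32}]; \quad
a_{22} = [A_{33}]
$$

If B is a similar matrix and is similarly partitioned, then each submatrix
$a_{ij}$ and $b_{ij}$ may be treated as a single element so that

$$
AB = C =
\begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22}
\end{bmatrix}
$$

Finally, the elements of C are completely evaluated by the usual rules for
matrix multiplication and addition.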

If $A = [A_{kj}]$ is a square matrix of order m and $B = [B_{pq}]$ is a square
matrix of order n, then the direct product

$$A \times B = [A_{kj}B_{pq}]$$

is a square matrix of order mn. The index pairs (k,p) and (j,q) refer to
the row and column, respectively. A suitable convention for arranging
the rows and columns consists in taking these pairs in such a way that
(j,q) precedes (j′,q′) if j < j′, or if j = j′ and q < q′ (dictionary order).

If A, C are of order m and B, F of order n then

$$(A \times B)(C \times F) = AC \times BF$$

a matrix of order mn. The direct product of matrices has of course
nothing to do with the cross product of vectors, for which the same symbol,
×, is used.


Problem a. Prove eq. (7). 

Problem b. Prove that (A X B)(C X F) = AC X BF. 
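
A numerical check is no substitute for the proof asked for in Problem b, but it helps to fix the index convention. In the sketch below (hypothetical function names, arbitrary sample matrices) the row pair (k, p) is mapped to the single 0-based index k·n + p and the column pair (j, q) to j·n + q, which is the dictionary order described above.

```python
def matmul(a, b):
    """Ordinary matrix product of two conformable matrices."""
    return [
        [sum(a[p][s] * b[s][q] for s in range(len(b))) for q in range(len(b[0]))]
        for p in range(len(a))
    ]


def direct_product(a, b):
    """A x B for square A (order m) and square B (order n), dictionary order."""
    n = len(b)
    size = len(a) * n
    return [
        [a[row // n][col // n] * b[row % n][col % n] for col in range(size)]
        for row in range(size)
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]  # order m = 2, illustrative values only
    c = [[2, 0], [1, 1]]
    b = [[0, 1, 0], [1, 0, 2], [1, 1, 1]]  # order n = 3
    f = [[1, 1, 0], [0, 1, 0], [2, 0, 1]]
    left = matmul(direct_product(a, b), direct_product(c, f))
    right = direct_product(matmul(a, c), matmul(b, f))
    print(left == right, len(left))
```

With this ordering the direct product appears as an m × m arrangement of blocks, the (k, j) block being $A_{kj}$ times B.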


implies that either or both of the matrices multiplied together 
matrix (cf. Problem, sec. 10.7). 

The unit matrix E has unity for elements along the "main diagonal."
All other elements are zero. The matrix elements are conveniently
symbolized by the Kronecker delta (cf. sec. 3.4)

$$
E_{pq} = \delta_{pq} =
\begin{cases}
0, & p \ne q \\
1, & p = q
\end{cases}
$$

For every matrix,

$$EA = AE = A$$


If all matrix elements vanish except diagonal ones, the matrix is called
diagonal. The general element of a diagonal matrix is thus
$D_{ij} = D_i\delta_{ij}$. All diagonal matrices commute with each other, for if D and D′
are diagonal

$$(DD')_{ik} = \sum_{j=1}^{n} D_i\delta_{ij}D'_j\delta_{jk} = D_iD'_i\delta_{ik} = (D'D)_{ik}$$

If a matrix A commutes with a diagonal one, D, the elements of A
vanish, except those for which the diagonal matrix has equal elements
$D_i = D_j$. The proof is as follows. Assume that AD = DA and that
$D_{ij} = D_i\delta_{ij}$. Then

$$\sum_k A_{ik}D_k\delta_{kj} = \sum_k D_i\delta_{ik}A_{kj}$$

and

$$A_{ij}D_j = D_iA_{ij}; \quad A_{ij}(D_i - D_j) = 0$$

Hence, either $D_i = D_j$ or $A_{ij} = 0$.

If all of the diagonal elements of D are different, A must be tr 
with all different elements, say A 1} A 2 , • • •, A n . It is sometime 
to write such a matrix in the form 

$$A = \mathrm{diag}\,(A_1, A_2, \cdots, A_n)$$

If some of the diagonal elements of D are repeated so that its form is

$$D = \mathrm{diag}\,(D_1, D_1, D_2, D_2, D_2, \cdots)$$

then the commuting matrix A will have the form

$$A = \mathrm{diag}\,(a_1, a_2, \cdots)$$

where the square matrices $a_i$ are arranged in symmetric positions about the
main diagonal and the other elements of A are zero. The first two
submatrices will be

$$
a_1 = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}; \quad
a_2 = \begin{bmatrix} A_{33} & A_{34} & A_{35} \\ A_{43} & A_{44} & A_{45} \\ A_{53} & A_{54} & A_{55} \end{bmatrix}
$$

6 The main or principal diagonal is that running from the upper left to the lower
right of the array.






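The sum of the diagonal elements of a square matrix is called the trace
(German "Spur").

$$\mathrm{Tr}\, A = \sum_{i=1}^{n} A_{ii}$$

The trace of the product of two matrices is independent of the
order of multiplication; for more than two factors it is unchanged by a
cyclic permutation of the factors. The proof is simple.

$$\mathrm{Tr}\, AB = \sum_i (AB)_{ii} = \sum_i \sum_j A_{ij}B_{ji} = \mathrm{Tr}\, BA$$

If A × B = C, Tr C = Tr A · Tr B.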

The transposed matrix to A, indicated by $\tilde{A} = [A_{ji}]$, is formed from A
by interchanging rows and columns. If A and B of (6) are transposed, $\tilde{A}$
becomes of order (h × n) and $\tilde{B}$ of order (m × h). They may be multiplied
together only in the order $\tilde{B}\tilde{A}$ and the product $\tilde{C}$ is of order (m × n).
Thus when a matrix product is transposed, the sequence of the matrices
forming the product must be reversed. This holds true for any number of
factors

$$F = ABCD \cdots X; \quad \tilde{F} = \tilde{X} \cdots \tilde{D}\tilde{C}\tilde{B}\tilde{A}$$

The matrix $\hat{A} = [A^{ji}]$ is the adjoint matrix.⁶ Note that the adjoint is
formed by first finding the cofactor $A^{pq}$ of the element $A_{pq}$ in |A| and
then transposing the resulting matrix. From the properties of determinants,
it follows that

$$A\hat{A} = \hat{A}A = |A|\,E \tag{10-8}$$

hence if A is singular

$$A\hat{A} = \hat{A}A = O \tag{10-9}$$

However, the adjoint matrix exists even when A is singular.

When A is a non-singular square matrix, we may divide $\hat{A}$ by |A| to
obtain a matrix $A^{-1}$ which is the reciprocal of A. Only square matrices
have reciprocals.

$$A^{-1} = \frac{\hat{A}}{|A|}; \quad AA^{-1} = A^{-1}A = E \tag{10-10}$$

Suppose the matrices of (6) are square and non-singular. Multiply both
sides of the equation by $B^{-1}A^{-1}$ and then by $C^{-1}$ in the order shown:

$$B^{-1}A^{-1}ABC^{-1} = B^{-1}A^{-1}CC^{-1}$$

Thus $C^{-1} = B^{-1}A^{-1}$. Reciprocation of a matrix product requires reversal
of the order of the factors as in the case of the transposed matrix product.
The rule holds for any number of factors.
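
Equations (8) to (10) and the reversal rule for reciprocals can be verified on small integer matrices. The following sketch is illustrative only: the helper names are hypothetical, entries are assumed to be integers so that exact fractions can be used, and the adjoint is built from cofactors exactly as described above.

```python
from fractions import Fraction


def minor(a, p, q):
    """Array left after striking out row p and column q."""
    return [row[:q] + row[q + 1:] for i, row in enumerate(a) if i != p]


def det(a):
    """Determinant by development along the first row; empty array gives 1."""
    if not a:
        return 1
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def matmul(a, b):
    """Ordinary matrix product."""
    return [
        [sum(a[p][s] * b[s][q] for s in range(len(b))) for q in range(len(b[0]))]
        for p in range(len(a))
    ]


def adjoint(a):
    """Transposed matrix of cofactors: element (i, j) is the cofactor of A[j][i]."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)] for i in range(n)]


def inverse(a):
    """Reciprocal matrix, adjoint divided by the determinant (integer entries assumed)."""
    d = det(a)
    if d == 0:
        raise ZeroDivisionError("a singular matrix has no reciprocal")
    return [[Fraction(x, d) for x in row] for row in adjoint(a)]


if __name__ == "__main__":
    a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]  # illustrative values only
    b = [[1, 2, 0], [0, 1, 0], [1, 0, 1]]
    print(matmul(a, adjoint(a)))  # |A| times the unit matrix
    print(inverse(matmul(a, b)) == matmul(inverse(b), inverse(a)))
```

For a singular matrix such as [[1, 2], [2, 4]] the adjoint still exists and its product with the matrix is the null matrix, in agreement with (9).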

If the elements of A are complex numbers, the complex conjugate of A is
defined as $A^* = [A^*_{ij}]$. Unlike the preceding case, if F = ABC ··· X,
$F^* = A^*B^*C^* \cdots X^*$.

The matrix formed by taking the complex conjugate of all the elements
and then transposing the matrix is called the associate matrix.⁷
$A^\dagger = (\widetilde{A^*}) = (\tilde{A})^*$. If F = ABC; $F^\dagger = C^\dagger B^\dagger A^\dagger$.

At this point we have defined four important operations on a matrix A.
These result in $-A$, $\tilde{A}$, $A^{-1}$ and $A^*$. It is important to note that each of
these operations has the reflexive property, so that when the operation is
performed twice, the original matrix is reproduced:

$$-(-A) = A; \quad \tilde{\tilde{A}} = A; \quad (A^{-1})^{-1} = A; \quad (A^*)^* = A$$

By combining these operations in all possible ways, the following 16 matrices
may be derived from A: $\pm A$, $\pm A^{-1}$, $\pm A^*$, $\pm\tilde{A}$, $\pm(\tilde{A})^{-1}$, $\pm(A^*)^{-1}$,
$\pm A^\dagger$, $\pm(A^\dagger)^{-1}$. In certain cases, A may be identical with some other
member of this set. Such matrices have been given special names. We
shall have occasion to discuss the properties of most of them later, but
for convenience we list them now in Table 1. We will have no need of
the types: $A = A^{-1}$ (involutory) and $A = (A^*)^{-1}$.


TABLE 1

Relation                      Name of A          Matrix Elements

$A = \tilde{A}$               symmetric          $A_{pq} = A_{qp}$
$A = -\tilde{A}$              skew symmetric     $A_{pp} = 0$; $A_{pq} = -A_{qp}$
$A = \tilde{A}^{-1}$          orthogonal         cf. eq. (42)
$A = A^*$                     real               $A_{pq} = A^*_{pq}$
$A = -A^*$                    pure imaginary     $A_{pq} = iB_{pq}$; $B_{pq}$ real
$A = A^\dagger$               Hermitian          $A_{pq} = A^*_{qp}$
$A = -A^\dagger$              skew Hermitian     $\mathrm{Re}\,A_{pp} = 0$; $A_{pq} = -A^*_{qp}$
$A = (A^\dagger)^{-1}$        unitary            cf. eq. (50)


Note that a real symmetric matrix is a special case of an Hermitian
matrix. Suppose H = A + iB is Hermitian with both A and B real; then
$H^\dagger = \tilde{A} - i\tilde{B}$; but by definition $H = H^\dagger$. Thus the real part is
symmetric and the imaginary part skew symmetric; in other words, a real
Hermitian matrix is also symmetric. Similarly, a real orthogonal matrix is
unitary, for if U = A + iB is unitary then by definition $U = (U^\dagger)^{-1}$,
$U^\dagger U = E$ and $(\tilde{A} - i\tilde{B})(A + iB) = E$. If $B = \tilde{B} = 0$, then $\tilde{A}A = E$
which defines the orthogonal matrix. However, a complex symmetric
matrix is not Hermitian nor is a complex orthogonal matrix unitary.

Problem. Show that AB = O but BA ≠ O where


'“-6 

1 

1 

to 


0 

1 

-2 ~ 

-9 

-6 —3 

; b = 

-1 

0 

3 

3 

2 1 


2 

-3 

0 


A = 


10.8. Real Linear Vector Space. Let us consider a space of two dimensions,
that is, a plane in ordinary three-dimensional space. A vector in
this space, as we have shown in sec. 1.1, is completely described by its two
components or by the coordinates of its origin and terminus. It is also
described by the matrix x of one column, or its transposed, the row vector
$\tilde{x}$, the two real numbers which are its components being the two
matrix elements. After we have chosen one vector it is possible to find
another vector y in the same plane which is not a multiple of x. In fact,
it is completely independent of x. But no matter how we draw a third
vector z, it may always be represented as

$$ax + by = z$$

where a and b are numbers. There is nothing unique about x and y, the
point being that two and only two vectors are linearly independent in two
dimensions and a third vector is linearly dependent on the other two. The
situation may further be characterized as follows. If two vectors are
linearly independent, no relation

$$ax + by = 0$$

can exist unless a = b = 0, for as we have seen a linear combination of two
vectors gives a new vector. For the purposes of this chapter, we shall
need more than two or three dimensions, hence we shall speak of a space of
n dimensions, where n is an integer. When n is greater than three, it is, of
course, impossible to visualize the situation, but the geometric concepts of
ordinary space will be used wherever convenient. Thus an n-dimensional
coordinate system will consist of n mutually perpendicular axes, a point
will require n coordinates for its location and a vector will be described
by means of its n components or by the coordinates of its origin and
terminus.

Suppose the components of a vector in such a space are real numbers
$x_1, x_2, \cdots, x_n$, then we may write the vector x as a matrix of either a single
row or a single column as in (5a) or (5b).

The scalar product of two vectors is a scalar

$$\tilde{x}y = x_1y_1 + x_2y_2 + \cdots + x_ny_n \tag{10-11}$$

The square of the length of a vector is defined as in sec. 1.1

$$l^2 = \tilde{x}x = x_1^2 + x_2^2 + \cdots + x_n^2 \tag{10-12}$$

The vector product, usually denoted by y × x, is more difficult to formulate
by matrix methods. To obtain it, we first construct from y the
skew-symmetric matrix

$$
Y = \begin{bmatrix} 0 & -y_3 & y_2 \\ y_3 & 0 & -y_1 \\ -y_2 & y_1 & 0 \end{bmatrix}
$$


The vectors Uj, u 2 , • • u n are linearly independent i 
scalar quantities C\, c 2 , ■ • \ c n not all zero such thai 

ClUj + C 2 u 2 -f • • • + c n u n = 

The simplest way of testing vectors for linear indej 
the Gram determinant (see sec. 3.13) 


UlUi UiU 2 * * • UlU 

U 2 U! U 2 U 2 • • • U2^ 

Until U„Uo • • • Uni 

If | F | vanishes, the vectors are linearly dependen 
independent. 

When n linearly independent vectors have bee: 
n-dimensional coordinate system or basis } being ec 
coordinate axes. Any other vector v may then b 
combination of the chosen vectors u 1} u 2 , • • •, u n , 
being unique. It should be emphasized that there i 
the choice of the basis, for any n linearly independ* 
for that purpose although the most convenient chc 
of unit vectors. The latter are defined by the rela 



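$$
\begin{aligned}
e_1 &= \{1, 0, 0, 0, \cdots, 0\} \\
e_2 &= \{0, 1, 0, 0, \cdots, 0\} \\
e_3 &= \{0, 0, 1, 0, \cdots, 0\} \\
&\ \ \vdots \\
e_n &= \{0, 0, 0, 0, \cdots, 1\}
\end{aligned}
$$

or similarly as row vectors. Clearly they are of unit length and mutually
perpendicular, for

$$\tilde{e}_ie_j = \delta_{ij}$$

In terms of the unit vectors, any vector x may be written

$$x = x_1e_1 + x_2e_2 + \cdots + x_ne_n$$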

If the origin of x is taken as coincident with the ori, 
by the e z , the components of x are the coordinates i 
It is often necessary to use a particular set of linearly independent
vectors as a basis, constructing from them a set which is mutually
perpendicular and of unit length. This procedure, the Schmidt
orthogonalization method, is effected in the following way. Suppose the
given vectors are $u_1, u_2, \cdots, u_n$. Select any one of them, say
$u_1$, and let $v_1 = u_1$; $e_1 = v_1/l_1$, where $l_1$ is the length of $v_1$. Then
take $v_2 = u_2 - c_{21}e_1$, choosing $c_{21}$ so that $\tilde{e}_1v_2 = \tilde{e}_1u_2 - c_{21}\tilde{e}_1e_1 = 0$
which requires $c_{21} = \tilde{e}_1u_2$ or $v_2 = u_2 - (\tilde{e}_1u_2)e_1$. If we put $e_2 = v_2/l_2$,
where $l_2$ is the length of $v_2$, we shall have $\tilde{e}_1e_2 = \delta_{12}$. Next let
$v_3 = u_3 - c_{31}e_1 - c_{32}e_2$, determining the constants so that $\tilde{e}_1v_3 = \tilde{e}_1u_3 - c_{31} = 0$
and $\tilde{e}_2v_3 = \tilde{e}_2u_3 - c_{32} = 0$, which means that $c_{31} = \tilde{e}_1u_3$ and
$c_{32} = \tilde{e}_2u_3$. Finally let $e_3 = v_3/l_3$. Continuing in this way, we may
construct the complete set of n unit vectors with

$$e_{n+1} = \frac{v_{n+1}}{l_{n+1}}; \quad v_{n+1} = u_{n+1} - \sum_{k=1}^{n} (\tilde{e}_ku_{n+1})e_k$$
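
The orthogonalization steps just described translate almost line by line into code. The sketch below is a minimal illustration in floating-point arithmetic; the function names `dot` and `schmidt` and the tolerance used to detect linear dependence are assumptions introduced here, not part of the text.

```python
import math


def dot(x, y):
    """Scalar product of two real vectors given as lists of components."""
    return sum(a * b for a, b in zip(x, y))


def schmidt(vectors):
    """Return mutually perpendicular unit vectors built from the given vectors."""
    basis = []
    for u in vectors:
        v = list(u)
        for e in basis:
            c = dot(e, u)  # coefficient c = (e, u), as for c21, c31, c32 above
            v = [vi - c * ei for vi, ei in zip(v, e)]
        length = math.sqrt(dot(v, v))
        if length < 1e-12:  # assumed tolerance for "zero length"
            raise ValueError("the given vectors are linearly dependent")
        basis.append([vi / length for vi in v])
    return basis


if __name__ == "__main__":
    es = schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
    for e in es:
        print([round(x, 6) for x in e])
```

If the given vectors are linearly dependent, some $v_k$ comes out with zero length and cannot be normalized; the sketch reports this case instead of dividing by zero.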

Problem a. Consider the columns of the matrix of Problem, sec. 10.5, as the
components of four vectors. Test them for linear dependence.

Problem b. Prove that $Y^3 = -[y]\{y\}Y$. This relation is known as the "Cayley
identity."

10.9. Linear Equations. Matrix methods are useful in solving and
discussing linear equations of the form

$$
\begin{aligned}
A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n &= y_1 \\
A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n &= y_2 \\
&\ \ \vdots \\
A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n &= y_n
\end{aligned}
\tag{10-16}
$$

which are inhomogeneous.¹⁰ They may also be written as

$$Ax = y$$

The corresponding homogeneous equation is

$$Ax = O$$

The matrix A and the vector y are to be considered as known while the
n components of x are unknown. The questions of chief interest concern
the number of possible solutions and the method of finding them. Several
cases arise depending on the rank of A, but for our purposes we consider¹¹
only three possibilities.

a. y ≠ 0; |A| ≠ 0. According to (10), $A^{-1}$ exists, hence

$$x = A^{-1}y = \frac{\hat{A}y}{|A|}$$

is the unique solution. From the definition of $\hat{A}$ and the rule for matrix
multiplication, it also follows that

$$x_i = \frac{y_1A^{1i} + y_2A^{2i} + \cdots + y_nA^{ni}}{|A|}$$

which is commonly known as Cramer's rule.
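
Case (a) may be illustrated by a direct transcription of Cramer's rule. The sketch assumes integer coefficients so that exact fractions can be used; `minor`, `det`, `cofactor` and `cramer` are hypothetical names, and the sample system is chosen so that its solution is known in advance.

```python
from fractions import Fraction


def minor(a, p, q):
    """Array left after striking out row p and column q."""
    return [row[:q] + row[q + 1:] for i, row in enumerate(a) if i != p]


def det(a):
    """Determinant by development along the first row; empty array gives 1."""
    if not a:
        return 1
    return sum(a[0][k] * cofactor(a, 0, k) for k in range(len(a)))


def cofactor(a, p, q):
    """(-1)^(p+q) times the complementary minor of A[p][q]."""
    return (-1) ** (p + q) * det(minor(a, p, q))


def cramer(a, y):
    """Solve Ax = y for non-singular A: x_i = sum_k y_k * cofactor(k, i) / |A|."""
    d = det(a)
    if d == 0:
        raise ValueError("|A| = 0: Cramer's rule does not apply")
    n = len(a)
    return [
        Fraction(sum(y[k] * cofactor(a, k, i) for k in range(n)), d)
        for i in range(n)
    ]


if __name__ == "__main__":
    a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]  # illustrative values only
    y = [5, 13, 15]                        # equals A applied to {1, 2, 3}
    print(cramer(a, y))
```

As with the Laplace development itself, the labor grows quickly with the order, so the rule is mainly of use for small systems and for formal arguments.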


12 


10 See sec. 2.5 for the meaning of the term homogeneous. 

11 r ru ^ a:„ — a lv/r a 


T-Ti rrTh Ar A IfTAKrO ’ ’ 




b. y = 0; |A| ≠ 0. The only solutions are the trivial ones $x_1 =
x_2 = \cdots = x_n = 0$.

c. y = 0; |A| = 0; $A^{ik} \ne 0$ for at least one value of i and k. If we
knew the value of one of the unknowns we could find the values of the
remaining (n − 1) unknowns, since we could then form from the original
set (n − 1) inhomogeneous equations with non-vanishing determinant.
In other words, we are confronted in case (c) with n unknowns but only
(n − 1) equations. However, we note that the k-th row of our set of
n equations is

$$A_{k1}x_1 + A_{k2}x_2 + \cdots + A_{kn}x_n = 0 \tag{10-17}$$

so that if we take

$$x_i = cA^{ji} \tag{10-18}$$

where c is any constant, it follows from (3) that (17) is satisfied. Even if
j = k, we still have a solution, for (17) is then identical with (2) but
|A| = 0. We thus have an infinite number of solutions of the homogeneous
equation when |A| = 0 as j may take any value from 1 to n and
c is completely arbitrary. Of course, some of the solutions (18) may be
worthless, since several of the cofactors may vanish; but it will be found
that there are always enough non-vanishing ones so that the ratio of all the
unknowns is determined.¹³ The fact that the set of homogeneous equations
Ax = O possesses non-trivial solutions only when |A| = 0 is of great
importance in many problems and will often be used in the next chapter.

Problem a. Sometimes chemical analysis must be done in an indirect way. Solve 
the following problem by means of determinants. A mixture of sodium chloride, 
sodium bromide, and sodium iodide weighed 0.5000 gram. Upon the addition of silver 
nitrate, the mixed silver halides weighed 1.0369 g. The iodine in the mixture was pre- 
cipitated as palladous iodide, which weighed 0.3006 g. Find the composition of the 
original mixture. 

Ans. NaCl, 50%; NaBr, 25%; NaI, 25%.

Problem b. Solve the set of linear equations which result from application of
Kirchhoff's laws to the network known as a Wheatstone bridge, out of balance, and obtain
the current through the galvanometer. Label resistances $R_1, R_2, R_3, R_4$, going
clockwise around the "diamond." Let galvanometer resistance be $R_g$.

Ans.

$$\frac{(R_2R_4 - R_1R_3)E}{R_1R_2R_3 + R_2R_3R_4 + R_3R_4R_1 + R_4R_1R_2 + R_g(R_1 + R_2)(R_3 + R_4)}$$

where E is the external electromotive force.

10.10. Linear Transformations. Consider a set of linear, inhomogeneous
equations, similar to (16), relating m quantities x and h quantities
x′, which can be written in the form

$$x'_s = \sum_{q=1}^{m} B_{sq}x_q; \quad s = 1, 2, \cdots, h \tag{10-19}$$

These equations define a linear transformation, which may be interpreted
in two different ways, as we shall see later.

Furthermore, let n quantities x″ be related to x′

$$x''_p = \sum_s A_{ps}x'_s; \quad p = 1, 2, \cdots, n$$

Then, on combining these equations, we find

$$x''_p = \sum_{s,q} A_{ps}B_{sq}x_q$$

The same equations could also be written as a single sum

$$x''_p = \sum_q C_{pq}x_q \tag{10-20}$$

provided that we agree to take $C_{pq}$ as in eq. (6) which defines the law of
matrix multiplication.

The importance of linear transformations thus indicates that this
definition of matrix multiplication would be a useful one. It could, of
course, be defined in other ways, and one possibility is

$$(AB)_{ik} = A_{ij}B_{kj}$$

where, for convenience, the summation sign is omitted and summation
over the repeated index, as in tensor analysis, is required. The consequence
of this definition is interesting, for if we multiply by a third matrix C, we
find

$$(AB \cdot C)_{ik} = (AB)_{ij}C_{kj} = A_{is}B_{js}C_{kj}$$

but

$$(A \cdot BC)_{ik} = A_{is}(BC)_{ks} = A_{is}B_{kj}C_{sj}$$

and the matrix elements are not the same in the two cases. Therefore,
the associative law of multiplication would not hold.

A linear transformation is frequently interpreted as a rotation of
coordinate axes. In sec. 4.1, we have shown how direction cosines may
be used to relate the components of a vector in two different coordinate
systems. Let us now consider these relations in matrix notation. Suppose
that two systems OXYZ and OX′Y′Z′, or bases with unit vectors e and e′,
coincide originally and that OX′Y′Z′ is then rotated in the positive
(counterclockwise) direction through the angle φ about the OZ-axis. A
vector x′ in the new system is related to the same vector x in the original
system by an equation similar to (19), which in matrix form would read

$$x' = B(\varphi)x \tag{10-19a}$$

Rotate the system again, this time through the angle θ to obtain OX″Y″Z″




from which we see that C = AB is a matrix which transforms x directly
to x″, without passing through the intermediate system in which it was
called x′. We also see from (20a) that, if physical significance is to be
attached to the matrices A and B, the order of performing the operations
involved must be preserved, for in general AB ≠ BA although the matrices
do commute in the case now being discussed. As suggested by this
example, we always understand that the order¹⁴ is from right to left, B first
and then A in the case of eq. (20a).

Provided C is not singular, we may find $C^{-1}$, hence

$$C^{-1}x'' = C^{-1}Cx = x \tag{10-21}$$

We may thus use (20) or (21) to determine the components of the same
vector in either of two different coordinate systems.

Sometimes one wishes to think of linear transformations in another way.
Suppose that there is only one coordinate system and that a vector x is
rotated through the angle φ in the positive direction to give a new vector y
in the same coordinate system. If the reader will think about the situation
for a moment (or draw a simple figure, if necessary) he will be convinced
that the operation is equivalent to rotation of the coordinate system
through the angle φ in the negative (clockwise) direction. The matrix
elements will therefore not be identical with those used when we assume two
different coordinate systems. They are, however, closely related as will
be shown in sec. 10.17.
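
The two interpretations of a rotation can be compared numerically. The explicit matrices in the sketch below are the standard forms for rotation about the OZ-axis and are an assumption made here, since this section does not write them out; the function names are likewise hypothetical.

```python
import math


def axes_rotation(phi):
    """Components in axes turned counterclockwise by phi about OZ (assumed standard form)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def vector_rotation(phi):
    """Vector turned counterclockwise by phi about OZ, axes held fixed (assumed standard form)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Ordinary matrix product."""
    return [
        [sum(a[p][s] * b[s][q] for s in range(len(b))) for q in range(len(b[0]))]
        for p in range(len(a))
    ]


def close(a, b, tol=1e-12):
    """Element-by-element comparison of two matrices within a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    phi, theta = 0.3, 0.5  # arbitrary angles in radians
    b_phi, a_theta = axes_rotation(phi), axes_rotation(theta)
    c = matmul(a_theta, b_phi)  # B first, then A: order from right to left
    print(close(c, axes_rotation(theta + phi)))
    print(close(c, matmul(b_phi, a_theta)))  # rotations about one axis commute
    print(close(vector_rotation(phi), axes_rotation(-phi)))
```

The last comparison expresses the remark above: turning the vector through φ in the positive direction gives the same components as turning the axes through φ in the negative direction.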

10.11. Equivalent Matrices. — Let P and Q be non-singular matrices. 
Then A and B are said to be equivalent when 

B = PAQ (10-22) 

Equivalent matrices have many properties in common as the subsequent 
discussion will show; their importance is due to the fact that it is often 
possible by means of a linear transformation like (22) to find an equivalent 
matrix which has simpler properties than the original one. When the 
equivalent matrix is in its simplest form, usually diagonal, it is said to be 
canonical. 15 The problem of finding an equivalent matrix of canonical 
form is analogous to that of finding a suitable coordinate system in ordinary 
scalar algebra (cf. Chapter 5). 

Several special cases of equivalent matrices are possible, depending on 
the nature of the matrices P and Q effecting the transformation. 

14 The opposite convention is often used and endless confusion may result if this 
fact is overlooked when the equations of various writers are compared. The elements 
of the matrix products, of course, must always be evaluated from left to right, as re- 




a. If PQ = E, then

$$B = Q^{-1}AQ \tag{10-23}$$

The transformation is called collineatory or a similarity transformation
(Ähnlichkeitstransformation). The two matrices A and B are said to be
the transforms of each other.

b. If $P = \tilde{Q}$, the transformation is called congruent:

$$B = \tilde{Q}AQ \tag{10-24}$$

c. If $P = Q^\dagger$, then

$$B = Q^\dagger AQ \tag{10-25}$$

the transformation is conjunctive. If the matrices are all real, this becomes
identical with (24).

d. If PQ = E, $P = \tilde{Q}$ (i.e., Q is orthogonal) and all the matrix elements
are real, then

$$B = \tilde{Q}AQ = Q^{-1}AQ \tag{10-26}$$

represents a real orthogonal transformation. It is both collineatory and
congruent.

e. If the matrix elements are complex and PQ = E, $P = Q^\dagger = Q^{-1}$
(i.e., Q is unitary), then

$$B = Q^\dagger AQ = Q^{-1}AQ \tag{10-27}$$

is called a unitary transformation. It is collineatory and conjunctive.
10.12. Bilinear and Quadratic Forms. A homogeneous polynomial of
the second degree in 2n variables $x_1, x_2, \cdots, x_n$; $y_1, y_2, \cdots, y_n$ is called a
bilinear form. It may be abbreviated as

$$A(x,y) = \tilde{x}Ay = \sum_{i,j}^{n} A_{ij}x_iy_j \tag{10-28a}$$

where $A = [A_{ij}]$. If both x and y undergo non-singular transformations

$$x = \tilde{P}x'; \quad y = Qy'$$

$$A(x,y) = \tilde{x}'PAQy' = \tilde{x}'By' = A'(x',y') \tag{10-29}$$

If $P = Q^{-1}$, so that $x = \tilde{Q}^{-1}x'$; $y = Qy'$, then x and y are called
contragredient variables since they undergo opposite transformations.

As a special case of a bilinear form suppose x = y. Then the coefficient
of $x_ix_j$ (i ≠ j) in (29) is $(A_{ij} + A_{ji})$, and the matrix A becomes
symmetric if we write $(A_{ij} + A_{ji})/2$ for every $A_{ij}$ and $A_{ji}$. Eq. (28a)
may then be written

the variables, it is called a positive definite quadratic form; if it 
or zero, it is called semi-definite . 

10.13. Similarity Transformations. Suppose the same vector is x
when referred to the basis e and x′ when referred to another basis e′.
Let another vector be y or y′ where

$$x = Qx'; \quad y = Qy' \tag{10-30}$$

Such variables which undergo the same transformation are called
cogredient. Now consider the transformation

$$x = Ay \tag{10-31}$$

which changes y into x in the basis e. Then,

$$Qx' = Ay = AQy'$$

or, if Q is non-singular

$$x' = Q^{-1}AQy' = By' \tag{10-32}$$

Hence, (32) is a transformation which changes y′ into x′ in the basis e′
while A, the transform of B, performs the corresponding transformation
from y to x in the basis e. This is the reason for the name similarity
transformation.

An alternative interpretation of the transform may be given. Let
x, x′, y, y′ be four different vectors all in the same basis. Then (30)
changes x′ into x and y′ into y while (31) changes y into x. The
transformation that changes y′ directly into x′ is (32), since $x' = Q^{-1}x =
Q^{-1}Ay = Q^{-1}AQy'$. Here as in sec. 10.10, the form of the matrix equation
is similar for different vectors in the same basis or the same vector in
different reference frames. The matrix elements, however, will not be
identical in the two cases.

10.14. The Characteristic Equation of a Matrix. If λ is a
parameter, A is a square matrix of order n and E the unit matrix of the
same order, the matrix

$$K = [\lambda E - A]$$

is called the characteristic matrix of A. The equation

$$K(\lambda) = |K| = |\lambda E - A| = 0$$

or its equivalent

$$K(\lambda) = \lambda^n + a_1\lambda^{n-1} + a_2\lambda^{n-2} + \cdots + a_n = 0$$

where the $a_s$ are functions of the elements of A, is the characteristic equation
of A. The n roots of K(λ), $\lambda_1, \lambda_2, \lambda_3, \cdots, \lambda_n$, not necessarily all

r*h nnr/i ^ n ( r\v 1 i \ CX-n + i i 4-Tkn (s 




and comparing coefficients of the different powers of λ it will be seen that

$$
\begin{aligned}
\lambda_1 + \lambda_2 + \cdots + \lambda_n &= -a_1 \\
\lambda_1\lambda_2 + \lambda_1\lambda_3 + \cdots + \lambda_{n-1}\lambda_n &= a_2 \\
\lambda_1\lambda_2\lambda_3 + \cdots + \lambda_{n-2}\lambda_{n-1}\lambda_n &= -a_3 \\
&\ \ \vdots \\
\lambda_1\lambda_2\lambda_3 \cdots \lambda_n &= (-1)^na_n
\end{aligned}
$$

If $B = Q^{-1}AQ$, then $[\lambda E - B] = [\lambda E - Q^{-1}AQ] = Q^{-1}[\lambda E - A]Q$.
Moreover,

$$|\lambda E - B| = |Q^{-1}|\,|\lambda E - A|\,|Q| = |\lambda E - A| \tag{10-35}$$

Hence two matrices related by a similarity transformation have the same
characteristic roots.

We leave the proof of the following statements to the reader.

$$\mathrm{Tr}\, Q^{-1}AQ = \mathrm{Tr}\, A$$

$$|Q^{-1}AQ| = |A|$$

If C = A × B, the characteristic roots of C are the products, taken in
pairs, of the roots of A and B.

Problem. Prove the statements of the preceding paragraph.
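
Before attempting the proofs, the first two statements and eq. (35) can be checked on a numerical example of order two. The sketch below uses exact fractions; the function names and the particular matrices A and Q are assumptions chosen for illustration, and for order two the characteristic equation is taken as λ² + a₁λ + a₂ = 0 with a₁ = −Tr A and a₂ = |A|.

```python
from fractions import Fraction


def matmul(a, b):
    """Ordinary matrix product."""
    return [
        [sum(a[p][s] * b[s][q] for s in range(len(b))) for q in range(len(b[0]))]
        for p in range(len(a))
    ]


def inverse_2x2(q):
    """Reciprocal of a non-singular matrix of order two."""
    d = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    if d == 0:
        raise ZeroDivisionError("Q must be non-singular")
    return [
        [Fraction(q[1][1], d), Fraction(-q[0][1], d)],
        [Fraction(-q[1][0], d), Fraction(q[0][0], d)],
    ]


def characteristic_coefficients(a):
    """(a1, a2) of lambda^2 + a1*lambda + a2 = 0 for a matrix of order two."""
    trace = a[0][0] + a[1][1]
    determinant = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return (-trace, determinant)


def similarity_transform(a, q):
    """Return the transform B = Q^-1 A Q."""
    return matmul(matmul(inverse_2x2(q), a), q)


if __name__ == "__main__":
    a = [[2, 1], [1, 3]]  # illustrative values only
    q = [[1, 2], [3, 5]]
    b = similarity_transform(a, q)
    print(b)
    print(characteristic_coefficients(a), characteristic_coefficients(b))
```

Although the elements of B differ from those of A, the two coefficient pairs coincide, so the trace, the determinant and the characteristic roots are unchanged.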

10.15. Reduction of a Matrix to Diagonal Form. Consider the linear
transformation

$$Ax = \lambda x \tag{10-36}$$

The only effect of the matrix A on the vector x is to multiply it by the
constant scalar factor λ. Rewriting (36) in the form

$$[\lambda E - A]x = Kx = O \tag{10-37}$$

we see that, except for the trivial case where all the components of x are
zero, |K| must vanish (cf. sec. 10.9b, c). Hence, as shown in the previous
section, λ can only take the values $\lambda_1, \lambda_2, \cdots, \lambda_n$ where $\lambda_i$ is one of the
characteristic roots of K(λ). These quantities are the eigenvalues of the
matrix A; the accompanying sets of vectors x are the eigenvectors. Eq. (36)
is the matrix form of an eigenvalue equation, other examples of which were
discussed in Chapter 8.

Now suppose B is a diagonal matrix; then the roots of its characteristic
equation are identical with its diagonal elements. If A is not a diagonal
matrix but is related to B by a similarity transformation, $B = Q^{-1}AQ$, then
it follows from (35) that its characteristic roots and equation are the same
as those of B. The problem of reducing A to diagonal form by means of a
similarity transformation is thus closely related to the problem of finding






Having found the X t -, we wish to determine a matrix X 

$$X^{-1}AX = \Lambda = [\lambda_i\delta_{ij}]$$

We distinguish the following two cases, 
a. The Eigenvalues are all Different . Let us consid< 
all eigenvalues of A are different. Select one, say \ k , an 
equations 

$$Ax = \lambda_k x$$


They are homogeneous, but as shown in sec. 10.9, we i 
the ratio of the components of the eigenvector x&. 
each component contains an arbitrary constant, we writ 
vector 

$$x_k = \{x_{1k}, x_{2k}, \ldots, x_{nk}\}$$

The remaining eigenvectors are determined in the sai: 
eigenvalue in turn. Finally we form a matrix X whoi 
eigenvectors of A. This matrix clearly satisfies the eqi 

$$AX = X[\lambda_i\delta_{ij}]$$

When this result is multiplied by X -1 , eq. (38) is obtain 
shown that the matrix X which diagonalizes A, may 
pounding the eigenvectors of A into a matrix. The re< 
form here described is unique except for the order in wh 
occur along the diagonal. 

Although not required by the method, the eigenvect 
onalized and normalized by the Schmidt process, wh: 
termined constant appearing in the solution of (39). 
question in sec. 10.17. 

b. The Eigenvalues are Not all Different . When 1 
eigenvalues of A are equal to each other, reduction to 
is not always possible. Suppose Xi is an eigenvalue of j 
Proceeding as before, we find an eigenvector Xi so that 

Ax i = XiX L 


Then if is the first column of a square matrix X, the 
will be XjXi and the first column of X~~ 1 AX will consis 
(n — 1) zeros. Call this matrix B: 


B = XT' 1 AX 


X, 

Bj I 

. 0 

Bij J 


i, j = 2, 3, 


Here Bj is a row matrix with (n — 1 ) elements and Bi 


Form a new matrix Y whose first column is that eigenvector. Note 
ver that Y has only (n — 1) rows and columns and 


Y~ l BijY = 


at the matrix 


ransform B into the form 


r Xi 

c k 1 

0 

Clcl _ 


n 


[oM 


Xi 

BjY I 

0 

Xi 

c k 

_o 

0 

Clcl _ 


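Continued applications of similar transformations will eventually result in a single matrix Z such that

$$Z^{-1}AZ = \begin{bmatrix} \Lambda_1 & F \\ 0 & G \end{bmatrix} \qquad (10\text{-}40)$$

$$\Lambda_1 = \begin{bmatrix}
\lambda_1 & H_{12} & H_{13} & \cdots & H_{1r_1} \\
0 & \lambda_1 & H_{23} & \cdots & H_{2r_1} \\
0 & 0 & \lambda_1 & \cdots & H_{3r_1} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & \lambda_1
\end{bmatrix} \qquad (10\text{-}41)$$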

The matrix F is rectangular with $r_1$ rows and $(n - r_1)$ columns while G is square and of order $(n - r_1)$. Now if $Z_1$ is a rectangular matrix composed of the first $r_1$ columns of Z, then we may remove the unwanted matrix F from (40), for

$$Z_1^{-1}AZ_1 = \Lambda_1$$

The next step is to treat the matrix G in a similar way until it is reduced to the form of $\Lambda_1$, with its eigenvalue $\lambda_2$ along the diagonal. We then continue with each remaining matrix until every eigenvalue has been used. Finally if we join together all of the rectangular matrices $Z_i$ to form a square matrix W, we will have

$$W^{-1}AW = \mathrm{diag}(\Lambda_1, \Lambda_2, \ldots, \Lambda_r)$$

where r is the number of distinct eigenvalues of A. Note that we are using the notation of sec. 10.7 to denote a diagonal matrix, but in this case the diagonal elements $\Lambda_i$ are really matrices themselves. Each is of the triangular form of (41).

In the general case, it is possible to make a further transformation so


diagonal elements of the triangular matrices may be completely removed 
so that the final form is truly diagonal. We consider these cases in secs. 
10.17, 10.19, 10.20. 


Problem. Reduce to diagonal form

$$\begin{bmatrix} 8 & -8 & -2 \\ 4 & -3 & -2 \\ 3 & -4 & 1 \end{bmatrix}$$

Ans. $\lambda_1 = 1$; $\lambda_2 = 2$; $\lambda_3 = 3$.
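The procedure of case (a) can be checked on this problem. The sketch below is illustrative only: the eigenvectors were worked out by hand from $(A - \lambda_k E)x = 0$ with one convenient choice of the arbitrary scale, and the helper names `matmul` and `det3` are assumptions of the example. It confirms that the matrix X whose columns are the eigenvectors satisfies $AX = X\Lambda$ and is non-singular, so that $X^{-1}AX = \Lambda$.

```python
def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A = [[8, -8, -2], [4, -3, -2], [3, -4, 1]]
eigenvalues = [1, 2, 3]
# One eigenvector per eigenvalue, found by hand; only the ratios are determined.
eigenvectors = [[4, 3, 2], [3, 2, 1], [2, 1, 1]]

n = len(A)
# X has the eigenvectors as its columns.
X = [[eigenvectors[k][i] for k in range(n)] for i in range(n)]
# Lam is the diagonal matrix [lambda_i * delta_ij].
Lam = [[eigenvalues[i] if i == j else 0 for j in range(n)] for i in range(n)]

assert matmul(A, X) == matmul(X, Lam)   # AX = X Lam
assert det3(X) != 0                     # X has an inverse
```

Multiplying any column of X by a non-zero constant leaves both checks valid, which is the arbitrariness in the eigenvectors noted in the text.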


10.16. Congruent Transformations. — When a change of variable, $x = Qy$, is applied to a quadratic form, (28b) becomes

$$A(x,x) = \tilde{x}Ax = \tilde{y}\tilde{Q}AQy = \tilde{y}By$$

Thus the transformation of the matrix A which corresponds to this change of variable is congruent. Its importance is due to the fact that by its use a quadratic form may be reduced to a sum of squared terms, as will now be shown. Provided B is diagonal, $\tilde{y}By$ will be a sum of squares. Hence our problem is that of diagonalizing the symmetric matrix A by means of a congruent transformation. Suppose $A_{11}$ in A is not equal to zero. Then A may be written as

$$A = \begin{bmatrix} A_{11} & V \\ \tilde{V} & A' \end{bmatrix}$$

where $A'$ is the matrix obtained from A by striking out the first row and first column and $V = [A_{12}, A_{13}, \ldots, A_{1n}]$. Now let

$$Q_1 = \begin{bmatrix} 1 & -V/A_{11} \\ 0 & E_{n-1} \end{bmatrix}$$

where $E_{n-1}$ is the unit matrix of order (n − 1). Then

$$\tilde{Q}_1AQ_1 = \begin{bmatrix} A_{11} & 0 \\ 0 & B \end{bmatrix}$$

and B is a matrix of order (n − 1) whose elements are

$$B_{ij} = A_{i+1,j+1} - \frac{A_{1,i+1}A_{1,j+1}}{A_{11}}$$

and for which the last row and column are designated (n − 1), not n. The matrix B may be treated in the same way and the process continued until A is completely reduced to diagonal form:

$$\tilde{Q}AQ = \mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_n)$$

The proof requires the theory of elementary divisors; see Turnbull and Aitken,





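The final result is

$$\tilde{x}Ax = \tilde{\xi}D\xi = \alpha_1\xi_1^2 + \alpha_2\xi_2^2 + \cdots + \alpha_n\xi_n^2$$

with

$$D = \tilde{Q}AQ = [\alpha_i\delta_{ij}]$$

$$Q = Q_1Q_2Q_3\cdots Q_n$$

and

$$x = Q\xi$$

The matrix Q will have the form

$$Q = \begin{bmatrix}
1 & -A_{12}/A_{11} & -A_{13}/A_{11} & \cdots & -A_{1n}/A_{11} \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
\times
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & -B_{12}/B_{11} & \cdots & -B_{1,n-1}/B_{11} \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
\times \cdots =
\begin{bmatrix}
1 & Q_{12} & Q_{13} & \cdots & Q_{1n} \\
0 & 1 & Q_{23} & \cdots & Q_{2n} \\
0 & 0 & 1 & \cdots & Q_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}$$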

The determinants, $\Delta_m$, formed from A by omitting all but the first m rows and columns are called the discriminants of the quadratic form. Moreover, as the reader may show (Problem b),

$$\alpha_1 = \Delta_1;\quad \alpha_2 = \Delta_2/\Delta_1;\quad \alpha_3 = \Delta_3/\Delta_2;\ \ldots;\quad \alpha_n = \Delta_n/\Delta_{n-1}$$

If it is so desired, a further linear transformation $\eta_i = \sqrt{\alpha_i}\,\xi_i$ will reduce $\tilde{x}Ax$ to the form

$$\tilde{\eta}\eta = \eta_1^2 + \eta_2^2 + \cdots + \eta_n^2$$

Assuming that no element $A_{ii}$ is zero, we note that instead of starting with $A_{11}$ at the first step, as we have done here, any of the remaining $A_{ii}$, (n − 1) in number, might have been chosen. At the second step, there are (n − 2) choices available and so on. Thus the final forms of Q and D are not unique. When some of the $A_{ii}$ are zero or when some of the


Qn = A 13 /A 33 ; Q 23 - A 23 / A 33 . 

Problem b. Verify the relation given in the preceding paragraph between the diagonal element $\alpha_i$ and the discriminant.

Problem c. Reduce the following expression to a sum of squares: $2x_1^2 + 7x_2^2 + 3x_3^2 + 4x_1x_2 + 8x_1x_3 - 2x_2x_3$. For one answer, $\alpha_1 = 2$; $\alpha_2 = 5$; $\alpha_3 = -10$.
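The step-by-step congruent reduction can be sketched in a few lines. This is a minimal illustration, assuming that every leading element met along the way is non-zero (the case treated in the text); the function name `congruent_diagonal` is hypothetical. At each step the leading element is recorded as the next $\alpha$ and the remaining block is replaced by $A_{i+1,j+1} - A_{1,i+1}A_{1,j+1}/A_{11}$. The matrix below is the symmetric matrix of the quadratic form in Problem c.

```python
from fractions import Fraction


def congruent_diagonal(A):
    """Return [alpha_1, ..., alpha_n] for a symmetric matrix A.

    Assumes each leading element encountered is non-zero; no reordering
    of the variables is attempted in this sketch.
    """
    M = [[Fraction(v) for v in row] for row in A]
    alphas = []
    while M:
        pivot = M[0][0]
        if pivot == 0:
            raise ValueError("zero leading element; reorder the variables first")
        alphas.append(pivot)
        size = len(M)
        # Reduced block: element (i, j) minus (first-row i)(first-row j)/pivot.
        M = [[M[i][j] - M[0][i] * M[0][j] / pivot for j in range(1, size)]
             for i in range(1, size)]
    return alphas


# Problem c: the off-diagonal entries are half the cross-term coefficients.
A = [[2, 2, 4],
     [2, 7, -1],
     [4, -1, 3]]
print([int(a) for a in congruent_diagonal(A)])  # [2, 5, -10]
```

The values agree with the ratios of discriminants, $\Delta_1 = 2$, $\Delta_2/\Delta_1 = 10/2$ and $\Delta_3/\Delta_2 = -100/10$.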

10.17. Orthogonal Transformations. — In this section we limit our discussion to real orthogonal matrices, since we shall have no need for those containing complex elements. By definition, if R is orthogonal, $\tilde{R} = R^{-1}$, hence $\tilde{R}R = R\tilde{R} = E$. One of their most important properties arises from the fact that transformation by them leaves the length of a vector unchanged. Suppose x and y are related by an orthogonal transformation

$$x = Ry;\quad \tilde{x} = \tilde{y}\tilde{R}$$

then

$$\tilde{x}x = \tilde{y}\tilde{R}Ry = \tilde{y}y$$

Our assertion is proved since $\tilde{x}x$ is the square of the length of x and $\tilde{y}y$ is the square of the length of y. On expanding $\tilde{R}R = E$ we find that

$$\sum_{s=1}^{n} R_{ps}R_{qs} = \sum_{s=1}^{n} R_{sp}R_{sq} = \delta_{pq} \qquad (10\text{-}42)$$

These relations are the necessary and sufficient conditions that a matrix be orthogonal.

From the definition $\tilde{R}R = E$, we also see that $|\tilde{R}| \times |R| = |R|^2 = 1$, hence

$$|R| = \pm 1$$

Let us consider two matrices

$$R^+ = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix};\quad
R^- = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & -1 \end{bmatrix}$$

which are easily shown to be orthogonal, the first having the determinant +1, the second −1. If we refer to eqs. (4-2) and (4-3), we see that they are both contained in eq. (42) and that the matrix $R^+$ represents a rotation of the coordinate system about OZ through the angle $\phi$ in the positive direction. Similarly, $R^-$ is the matrix for the same rotation, followed by a reflection in the XY-plane. These two cases are called proper and improper rotations. If $\phi = 0$ in the latter case, the operation is a simple reflection; if $\phi = \pi$, it is called an inversion, the matrix $R^-$ becomes diagonal, with −1 for its elements, and the result is equivalent to a change in sign of the three components of a vector (x,y,z).
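These statements about the two matrices are easy to confirm numerically. The sketch below builds both matrices for an arbitrary angle, tests the orthogonality condition (transpose times matrix equals the unit matrix), and evaluates the determinants. The function names `r_plus`, `r_minus`, `is_orthogonal` and `det3` and the tolerance are assumptions of this example.

```python
import math


def r_plus(phi):
    """Proper rotation about the Z-axis through the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def r_minus(phi):
    """The same rotation followed by a reflection in the XY-plane."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, -1.0]]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_orthogonal(M, tol=1e-12):
    """True when transpose(M) times M equals the unit matrix within tol."""
    P = matmul(transpose(M), M)
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))


phi = 0.7  # any angle will do
assert is_orthogonal(r_plus(phi)) and is_orthogonal(r_minus(phi))
assert abs(det3(r_plus(phi)) - 1.0) < 1e-12
assert abs(det3(r_minus(phi)) + 1.0) < 1e-12

# For phi = pi the improper rotation is the inversion diag(-1, -1, -1).
inversion = r_minus(math.pi)
assert all(abs(inversion[i][j] - (-1.0 if i == j else 0.0)) < 1e-12
           for i in range(3) for j in range(3))
```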

Matrices similar to $R^+$ are to be used in eqs. (10-20a) and (10-21), if we interpret the orthogonal transformation as a rotation of coordinate axes. However, if we prefer to rotate the vector x and obtain a new vector y, as we also discussed in sec. 10.10, we must change the sign of the angle in $R^+$. But $\sin(-\phi) = -\sin\phi$, $\cos(-\phi) = \cos\phi$ and the following results are obtained since the matrices are orthogonal:

$$\tilde{R}(\phi) = R^{-1}(\phi) = R(-\phi)$$

The matrix relations are thus

$$y = R(-\phi)x = R^{-1}(\phi)x = \tilde{R}(\phi)x$$

and

$$x = R(\phi)y$$

It follows that successive rotations of the same vector in one coordinate system to give new vectors y and z must be written in the form

$$y = B(-\phi)x;\quad z = A(-\theta)y;\quad z = A(-\theta)B(-\phi)x$$

or, if we prefer to retain the matrix elements shown in $R^+$, we must change the order of the matrix product to read

$$x = B(\phi)A(\theta)z$$

The fact that an orthogonal transformation is both congruent and 
collineatory makes it useful for the following reason: It has been seen 
that the congruent transformation may be used to reduce a quadratic form 
A(x,x) to a sum of squares, but the reduction is by no means unique. On 
the other hand, suppose the quadratic form has been reduced to a sum of 
squares by a congruent transformation and the elements of the transform- 
ing matrix are real. They can then be orthogonalized and normalized 
according to (42) and the resulting matrix R is both congruent and collinea- 
tory (hence orthogonal). In symbols, 

$$x = Ry;\quad A(x,x) = \tilde{x}Ax = \tilde{y}R^{-1}ARy = \tilde{y}\Lambda y$$

where $\Lambda$ is diagonal with the eigenvalues of A for elements. It will be remembered that when a matrix is reduced to diagonal form by a similarity transformation, the eigenvectors which form the columns of the transforming matrix X are not completely determined because the equations to be solved for the components of the eigenvectors are homogeneous. This arbitrariness now disappears, for we must fix the ratio of the components by (42) so that the transforming matrix is orthogonal and $\tilde{R}R = E$.

We are now in a position to prove a statement made in sec. 10.15, 
namely that if a matrix A is symmetric and has multiple eigenvalues it 
may still be reduced to true diagonal form by an orthogonal transformation. 
Suppose A undergoes a congruent transformation by the matrix Q. Then the new matrix $\tilde{Q}AQ$ is symmetric if A is symmetric, for $(\tilde{Q}AQ)^{\sim} = \tilde{Q}\tilde{A}Q = \tilde{Q}AQ$. It thus follows that an orthogonal transformation will leave the symmetry of A unchanged. This can only be true if the off-diagonal elements of the triangular matrices of (41) are zero, but then $\Lambda$ is diagonal.

Orthogonal transformations are often called principal axis transforma- 
tions since they are used in the problem of reducing a conic to principal 
axes and in finding the principal axes of a rotating body, or in reducing 
kinetic and potential energy expressions to sums of squared terms. The 
eigenvectors are frequently called normal coordinates in these cases. 19 

A similar procedure serves to reduce 20 simultaneously two quadratic 
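forms to a sum of squares. Suppose the two forms are $A(x,x) = \tilde{x}Ax$ and $B(x,x) = \tilde{x}Bx$. First reduce $A(x,x)$ to a sum of squares by a congruent transformation, $x = Qy$, which will give

$$\tilde{x}Ax = \tilde{y}\tilde{Q}AQy = \tilde{y}Dy = \tilde{y}[\alpha_i\delta_{ij}]y$$

The same transformation applied to B will give

$$\tilde{x}Bx = \tilde{y}\tilde{Q}BQy = \tilde{y}Cy$$

but C is not diagonal. Now make the substitution $\eta_i = \sqrt{\alpha_i}\,y_i$ which results in

$$\tilde{y}Dy = \tilde{\eta}E\eta;\quad \tilde{y}Cy = \tilde{\eta}C'\eta$$

where the $\alpha_i$ have been absorbed into C to give $C'$. Finally, an orthogonal transformation, $\eta = R\xi$, will reduce $C'$ to diagonal form, yielding

$$\tilde{y}Dy = \tilde{\xi}\tilde{R}ER\xi = \xi_1^2 + \xi_2^2 + \cdots + \xi_n^2$$

$$\tilde{y}Cy = \tilde{\xi}\tilde{R}C'R\xi = \tilde{\xi}\Lambda\xi = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_n\xi_n^2$$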


Even when the two quadratic forms are not functions of the same variables, the transformation may often be made. For example, in the mechanical problem of small oscillations where it is required to find normal coordinates for the kinetic and potential energies, the two quadratic forms appear as $T = \tilde{v}Av$ and $V = \tilde{x}Bx$ where $v = dx/dt$, T being positive definite. The reduction causes no difficulty since the cogredient variables x and v both undergo the same transformation. 21

We show in eq. (53) that for a unitary matrix, $\lambda_i\lambda_i^* = 1$ for every i. Since a real orthogonal matrix is also unitary, it follows that the only possible eigenvalues for a real orthogonal matrix are ±1 or $e^{i\phi}$. In the latter case $\phi$ must be real and the exponentials occur in pairs with opposite signs. If k is the number of eigenvalues equal to −1, the determinant of the matrix equals $(-1)^k$.

19 Cf. Chapter 9; for a fuller discussion, see Whittaker, E. T., “ A Treatise on the Analytical Dynamics of Particles and Rigid Bodies,” Third Edition, Cambridge Press, 1927.

20 This is not always possible, but it can be made if one of the forms is positive definite, as is the case in most physical problems.

21 An example of this was discussed in sec. 9.11; see also Whittaker, loc. cit., Chapter VII.

In the previous discussion of this section, we have shown that when the eigenvalues are real it is possible to reduce matrices to diagonal form by means of an orthogonal transformation. Now suppose that R, the matrix to be reduced, is itself orthogonal and of order n, and that the n eigenvalues are +1 occurring $j_1$ times, −1 occurring $j_2$ times ($j_1 + j_2 = j < n$) and $e^{\pm i\phi_k}$. Since the latter must appear in pairs there are an even number of them, i.e., $n - j = 2m$ and $k = 1, 2, \ldots, m$. Some simplification of the diagonal matrix may be made by noting that if $\phi_k = 0$ or $\pi$, $e^{i\phi_k}$ equals ±1. Thus we may write an even number of the eigenvalues ±1 as exponentials and obtain the following special cases for $\Lambda$, the diagonalized form of R.

n even, n = 2k

$$|R| = +1;\quad \Lambda = \mathrm{diag}(e^{i\phi_1}, e^{i\phi_2}, \ldots, e^{i\phi_k}, e^{-i\phi_1}, \ldots, e^{-i\phi_k})$$

$$|R| = -1;\quad \Lambda = \mathrm{diag}(1, -1, e^{i\phi_1}, \ldots, e^{i\phi_{k-1}}, e^{-i\phi_1}, \ldots, e^{-i\phi_{k-1}})$$

n odd, n = 2k + 1

$$|R| = +1;\quad \Lambda = \mathrm{diag}(1, e^{i\phi_1}, \ldots, e^{i\phi_k}, e^{-i\phi_1}, \ldots, e^{-i\phi_k})$$

$$|R| = -1;\quad \Lambda = \mathrm{diag}(-1, e^{i\phi_1}, \ldots, e^{i\phi_k}, e^{-i\phi_1}, \ldots, e^{-i\phi_k}) \qquad (10\text{-}43)$$

Now consider the form of the matrix X which diagonalizes R:

$$X^{-1}RX = \Lambda$$

Each column is an eigenvector $x_r$ of R and its eigenvalue will be assumed to be $e^{i\phi_r}$. But according to (36)

$$Rx_r = e^{i\phi_r}x_r \qquad (10\text{-}44)$$

and $x_r$ will in general be complex and of the form

$$x_r = x_r' + ix_r''$$

where $x_r'$ and $x_r''$ are real. In a similar manner, it follows that the eigenvector and column of X corresponding to $e^{-i\phi_r}$ is $x_r' - ix_r''$. We conclude that usually the transforming matrix X will have some complex elements. Recalling the fact that in the previous case, where X was orthogonal, the transformation was both collineatory and congruent we see that the necessary modification here is that the transformation be collineatory and conjunctive. The transforming matrix, then, is unitary, hence we can only diagonalize an orthogonal matrix by means of a unitary matrix.

Let us see what would happen if we transform a real orthogonal matrix R by another real orthogonal matrix S.

$$S^{-1}RS = Z$$


We write (44) as

$$R(x_r' + ix_r'') = e^{i\phi_r}(x_r' + ix_r'') \qquad (10\text{-}45)$$

for the particular eigenvalue $e^{i\phi_r}$. Since $e^{i\phi_r} = \cos\phi_r + i\sin\phi_r$, we equate the real and imaginary parts of (45) to get

$$Rx_r' = x_r'\cos\phi_r - x_r''\sin\phi_r$$

$$Rx_r'' = x_r'\sin\phi_r + x_r''\cos\phi_r$$

with a similar expression for the column of X that comes from e 
we replace the complex* eigenvector x r = x' + zx^ by x r and x r — 
x r the resulting matrix contains only real elements and may b 
diagonal by requiring that (42) be fulfilled. Let us call this ms 
Transformation by it will give the following forms for Z: 

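n even, n = 2k

$$|R| = +1;\quad Z = \mathrm{diag}(C_1, C_2, \ldots, C_k)$$

$$|R| = -1;\quad Z = \mathrm{diag}(1, -1, C_1, C_2, \ldots, C_{k-1})$$

n odd, n = 2k + 1

$$|R| = +1;\quad Z = \mathrm{diag}(1, C_1, C_2, \ldots, C_k)$$

$$|R| = -1;\quad Z = \mathrm{diag}(-1, C_1, C_2, \ldots, C_k)$$

where

$$C_k = \begin{bmatrix} \cos\phi_k & \sin\phi_k \\ -\sin\phi_k & \cos\phi_k \end{bmatrix}$$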

It is worth while to point out that the only other possible two-dimensional real orthogonal matrix is of the type

$$\begin{bmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{bmatrix}$$

Its eigenvalues are ztl, hence such matrices cannot occur in the i 
form of R as we have already included all real eigenvalues in the pr< 
expressions for Z. 
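The statement that this second type of two-dimensional orthogonal matrix has the eigenvalues ±1 follows directly from its characteristic equation, as the short calculation below shows (written out here for convenience; it uses only the definition of sec. 10.14):

```latex
\begin{aligned}
\left|\lambda E - \begin{bmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{bmatrix}\right|
  &= \begin{vmatrix} \lambda - \cos\phi & -\sin\phi \\ -\sin\phi & \lambda + \cos\phi \end{vmatrix} \\
  &= (\lambda - \cos\phi)(\lambda + \cos\phi) - \sin^2\phi \\
  &= \lambda^2 - 1 = 0, \qquad \lambda = \pm 1 .
\end{aligned}
```

The roots do not depend on $\phi$, and their product, −1, is the determinant of the matrix.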

Problem. Prove that $R^+$ and $R^-$ are orthogonal matrices. Reduce them to diagonal form.


10.18. Hermitian Vector Space. — Since many of the matrices 
ring in physical problems”" contain complex elements, it is necesj 

22 For the use of matrix theory in quantum mechanics, see Chapter 11. Foi 

dlSCliasifm . SAA TWn t ) ((Pi , ^ 


wmc m piace oi 


aiupuiy Idle VCULUi tuintpu piLotnicu xii oti'u. J-'U.^. 

(11), the Hermitian scalar product

$$x^\dagger y = x_1^*y_1 + x_2^*y_2 + \cdots + x_n^*y_n \qquad (10\text{-}46)$$

The square of the absolute length of a vector is then real,

$$x^\dagger x = x_1^*x_1 + x_2^*x_2 + \cdots + x_n^*x_n$$

If $x^\dagger y = y^\dagger x = 0$, the two vectors are orthogonal or mutually perpendicular. If $x^\dagger x = 1$, the vector is a unit vector or normalized. For a scalar a,

$$x^\dagger(ay) = a\,x^\dagger y;\quad (ax)^\dagger y = a^*x^\dagger y$$

The Hermitian scalar product is distributive

$$x^\dagger(y + z) = x^\dagger y + x^\dagger z$$

If A is any matrix,

$$x^\dagger Ay = (A^\dagger x)^\dagger y;\quad (Ax)^\dagger y = x^\dagger(A^\dagger y) \qquad (10\text{-}47)$$
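The rules for the Hermitian scalar product can be tried out with Python's built-in complex numbers. This is only an illustrative sketch; the function name `hdot` and the sample vectors are assumptions of the example.

```python
def hdot(x, y):
    """Hermitian scalar product: conjugate the first vector, then sum the products."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
z = [1j, 4 + 0j]
a = 2 + 3j

# The squared length of a vector is real.
assert hdot(x, x).imag == 0
assert hdot(x, x).real == 15

# Interchanging the two vectors gives the complex conjugate.
assert hdot(y, x) == hdot(x, y).conjugate()

# A scalar leaves the second factor unchanged but the first factor conjugated.
assert abs(hdot(x, [a * yi for yi in y]) - a * hdot(x, y)) < 1e-12
assert abs(hdot([a * xi for xi in x], y) - a.conjugate() * hdot(x, y)) < 1e-12

# The product distributes over a sum of vectors.
y_plus_z = [yi + zi for yi, zi in zip(y, z)]
assert abs(hdot(x, y_plus_z) - (hdot(x, y) + hdot(x, z))) < 1e-12
```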

10.19. Hermitian Matrices. — If the variables in the bilinear form (28) are complex conjugate to each other and if its matrix is Hermitian, the form is called Hermitian. Thus,

$$H(x,x) = \sum_{i,j} H_{ij}x_i^*x_j = x^\dagger Hx;\quad H_{ij} = H_{ji}^* \qquad (10\text{-}48)$$

In spite of the fact that the elements of (48) are complex, the form itself is real.

The eigenvalues of an Hermitian matrix are also all real. Suppose $\lambda_i$ is an eigenvalue corresponding to an eigenvector x, then

$$Hx = \lambda_ix;\quad x^\dagger Hx = \lambda_ix^\dagger x$$

Since $x^\dagger Hx$ and $x^\dagger x$ are both real, it follows that $\lambda_i$ is real.

An Hermitian matrix H remains Hermitian when transformed by either an orthogonal or a unitary matrix. To prove this statement for a real orthogonal matrix R, suppose $H_1$ is known to be Hermitian and $R^{-1}H_1R = H_2$. Then, since $\tilde{R} = R^{-1}$, we have $\tilde{H}_2 = \tilde{R}\tilde{H}_1\tilde{R}^{-1} = \tilde{R}\tilde{H}_1R$ and $H_2^\dagger = \tilde{R}^*H_1^\dagger R^*$. But $H_1^\dagger = H_1$ and R is assumed to be real, so $R^\dagger = \tilde{R}$, $R^* = R$. Thus $H_2^\dagger = \tilde{R}H_1R = R^{-1}H_1R = H_2$. The proof for a unitary matrix is similar.
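A small numerical case may make the reality of the eigenvalues concrete. The sketch below treats a 2 x 2 Hermitian matrix chosen purely for illustration and solves its characteristic equation $\lambda^2 - (\mathrm{Tr}\,H)\lambda + |H| = 0$ with the quadratic formula. The helper names `is_hermitian` and `eigenvalues_2x2` are assumptions of this example.

```python
import cmath


def is_hermitian(M):
    """True when M[i][j] equals the complex conjugate of M[j][i]."""
    n = len(M)
    return all(M[i][j] == complex(M[j][i]).conjugate()
               for i in range(n) for j in range(n))


def eigenvalues_2x2(M):
    """Roots of lambda^2 - Tr(M)*lambda + det(M) = 0 for a 2 x 2 matrix."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2


# Illustrative Hermitian matrix with complex off-diagonal elements.
H = [[2, 1 - 1j],
     [1 + 1j, 3]]

assert is_hermitian(H)
low, high = eigenvalues_2x2(H)
# Both roots come out real even though H has complex elements.
assert abs(low.imag) < 1e-12 and abs(high.imag) < 1e-12
assert abs(low - 1) < 1e-12 and abs(high - 4) < 1e-12
```

For a 2 x 2 Hermitian matrix the discriminant is $(H_{11} - H_{22})^2 + 4|H_{12}|^2$, which is never negative, so the square root is always real.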

As we have previously stated, a real symmetric matrix is a special case 
of the Hermitian matrix. Thus, except for slight modifications, the reduc- 
tion of Hermitian matrices to diagonal form is similar to the procedure 
used for real matrices. For example, an Hermitian form may be con- 
verted to a sum of squares in many ways by a conjunctive transformation. 
On the other hand, we saw in sec. 10.15 that a matrix could be converted to diagonal form, with eigenvalues on the diagonal, by means of a collineatory transformation. If the matrix is Hermitian, we may require that the transformation be both collineatory and conjunctive, hence unitary, and the diagonal form is then unique. The same argument which was used for a real symmetric matrix shows us that even if the eigenvalues are not all different, the true diagonal form may be obtained since transformation by a unitary matrix leaves the symmetry of an Hermitian matrix unchanged.

The necessary condition that two Hermitian forms be simultaneously reducible to a sum of squares is that they commute. 23 Suppose that $H(x,x) = x^\dagger Hx$ and $K(x,x) = x^\dagger Kx$ are given and that both H and K are Hermitian or unitary. 24 Let S be a unitary matrix that reduces H and K simultaneously to diagonal forms, $H'$ and $K'$:

$$H' = S^{-1}HS;\quad K' = S^{-1}KS$$

Clearly $H'$ and $K'$ commute since they are both diagonal, hence we may write

$$H'K' = S^{-1}HSS^{-1}KS = S^{-1}HKS$$

$$K'H' = S^{-1}KSS^{-1}HS = S^{-1}KHS$$

or

$$S^{-1}HKS = S^{-1}KHS$$

since $K'H' = H'K'$. It thus follows that $HK = KH$.

Problem. Prove that an Hermitian matrix remains Hermitian after transforma- 
tion by a unitary matrix. 

10.20. Unitary Matrices. — If we indicate a unitary matrix by U, then from its definition

$$\tilde{U} = (U^*)^{-1}$$

hence

$$U^\dagger = U^{-1};\quad U^\dagger U = UU^\dagger = E \qquad (10\text{-}49)$$

Suppose the elements in a single column of U are given by $U_j$, then the Hermitian scalar product of two columns

$$U_j^\dagger U_k = \delta_{jk}$$

A similar relation may be found between the rows. Hence the rows and columns of a unitary matrix of order n form a set of n mutually perpendicular unit vectors in Hermitian space. This may be seen at once by writing (49) explicitly:

$$\sum_s U_{ps}U_{qs}^* = \sum_s U_{sp}^*U_{sq} = \delta_{pq} \qquad (10\text{-}50)$$

These equations are analogous to (42) for orthogonal matrices.

If x and y are any two vectors,

$$(Ux)^\dagger Uy = x^\dagger(U^\dagger Uy) = x^\dagger y$$

Hence a transformation by a unitary matrix leaves a bilinear or quadratic form invariant. In particular, if $x = Uy$, then

$$x^\dagger x = (Uy)^\dagger Uy = y^\dagger y$$

This is the analogue of the fact that an orthogonal matrix in real vector space leaves the length of a vector unchanged. In fact, the unitary matrix in Hermitian vector space is the generalization of the orthogonal matrix in real vector space.

23 The sufficiency of this condition is proved by Weyl, H., “ The Theory of Groups and Quantum Mechanics,” Methuen, London, 1931.

24 If these were not Hermitian or unitary, neither of them could be reduced

The product of two unitary matrices U and V is also unitary:

$$(UV)^\dagger = V^\dagger U^\dagger = V^{-1}U^{-1} = (UV)^{-1} \qquad (10\text{-}51)$$

The reciprocal of a unitary matrix is unitary:

$$(U^{-1})^\dagger = (U^\dagger)^\dagger = U = (U^{-1})^{-1} \qquad (10\text{-}52)$$

The eigenvalues of a unitary matrix may be real or complex but of absolute value 1. Suppose $\lambda_i$ is an eigenvalue of U, then

$$Ux = \lambda_ix;\quad (Ux)^\dagger Ux = x^\dagger x = \lambda_i\lambda_i^*x^\dagger x \qquad (10\text{-}53)$$

Since $x^\dagger x$ is real and does not vanish, $\lambda_i\lambda_i^* = 1$.

A unitary matrix may be transformed into diagonal form by another unitary matrix V, the diagonal elements being the eigenvalues of U. The procedure is similar to that for similarity and orthogonal transformations. The eigenvectors must be normalized to satisfy $U^\dagger U = E$. The result is

$$V^{-1}UV = V^\dagger UV = \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$$
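As a concrete check of (49) and (53), the sketch below takes one particular 2 x 2 unitary matrix (an illustrative choice, not taken from the text), verifies that its Hermitian conjugate times itself is the unit matrix, and confirms that both eigenvalues have absolute value 1. The helper names `dagger`, `matmul` and `eigenvalues_2x2` are assumptions of the example.

```python
import cmath
import math


def dagger(M):
    """Hermitian conjugate: transpose and take complex conjugates."""
    return [[complex(M[j][i]).conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


def eigenvalues_2x2(M):
    """Roots of lambda^2 - Tr(M)*lambda + det(M) = 0 for a 2 x 2 matrix."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2


s = 1 / math.sqrt(2)
U = [[s, s * 1j],
     [s * 1j, s]]

# U-dagger times U is the unit matrix, eq. (49).
P = matmul(dagger(U), U)
assert all(abs(P[i][j] - (1 if i == j else 0)) < 1e-12
           for i in range(2) for j in range(2))

# Both eigenvalues are complex but of absolute value 1, eq. (53).
for lam in eigenvalues_2x2(U):
    assert abs(abs(lam) - 1) < 1e-12
```

For this matrix the two eigenvalues are $e^{\pm i\pi/4}$, a pair with opposite signs in the exponent.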

10.21. Summary on Diagonalization of Matrices. — The matter of diagonalizing matrices is so useful in practice that a final and simple statement regarding conditions for the feasibility of this reduction seems in order.

A matrix may be diagonalized (a) if all its eigenvalues are distinct (for procedure, see sec. 10.15a), (b) if it is Hermitian or symmetric (see secs. 10.16 and 10.19), (c) if it is unitary (see sec. 10.20). In cases (b) and (c) a unitary matrix can always be found to effect the transformation while in (a) a more general type of transforming matrix will be needed.




REFERENCES 

Aitken, A. C., “ Determinants and Matrices,” Interscience Publishers, Inc., New York, 
1948. 

Frazer, R. A., Duncan, W. J., and Collar, A. R., “ Elementary Matrices,” Cambridge 
University Press, New York, 1938. 

Kowalewski, G., “ Determinantentheorie einschliesslich der Fredholmschen Determinanten,” Third Edition, Chelsea Publishing Co., New York, 1942.

MacDuffee, C. C., “ The Theory of Matrices,” Second Edition, Chelsea Publishing Co., 
New York. 

Muir, T., “ Theory of Determinants,” London, 1906-1923. 

Perlis, S., “ Theory of Matrices,” Addison-Wesley Press, Inc., Cambridge, 1952. 

Schreier, O. and Sperner, E., “ Introduction to Modern Algebra and Matrix Theory,” 
Chelsea Publishing Co., New York, 1950. 

Wade, T. L., “ Algebra of Vectors and Matrices,” Addison-Wesley Press, Inc., Cam- 
bridge, 1951. 



CHAPTER 11 


QUANTUM MECHANICS 

11.1. In conformity with the scope of this book, the emphasis of the present chapter is on the mathematics of quantum mechanics, the physical ideas entering the discussion only in a secondary way. Limitation of space further demands that only the important, and this happily implies the more elementary, portions of the wide field be presented. Complete exclusion of physical ideas would, however, leave its subject matter so poorly joined and so incomprehensible to the student who has no prior knowledge of quantum mechanics that the value of an entirely formal treatment appears questionable. It is also true that no part of applied mathematics exacts of its student a more radical change from his customary habits of thought, a greater tolerance for new methods of inquiry, than does this latest branch. In order to provide the proper attitude of mind, we preface the later mathematical developments by a few qualitative remarks whose relevance to the present book is but auxiliary.

The central notion of classical mechanics is the mass point, or particle. Classical theory therefore presupposes, tacitly, that a physical system can in principle be recognized as a particle, or a set of particles. Until the advent of quantum physics this dogma has never been questioned; in fact scientific philosophers have frequently inflated it to the dimensions of a universal proposition claiming that all physical systems are composed of particles. The method of physical description in best accord with this fundamental attitude is clearly this: To correlate instantaneous positions of a given particle with instants of time, assuming motion to be continuous in space and time. Thus, if a particle moves along the X-axis, the complete description of its motion would appear in the form x = f(t).

Now it is conceivable that such a correlation becomes impossible, and the question then arises whether this fundamental mode of description should be abandoned in such circumstances. The answer which has often been given and which the modern physicist emphatically rejects is the flatly negative one, the answer alleging that classical description is intrinsically evident and that the relation x = f(t) has meaning even when the functional relation cannot be established. On the other hand, one would not




as a function of t. The criterion which has ultimately produced clarity is this: A method of description must be abandoned when it becomes impossible, not because of experimental difficulty, but because its use contradicts known laws of science. Classical description has become impossible for the latter reason, as the following simple example will show.

Imagine an oscillating mass point, e.g., the bob of a pendulum. As long as the eye can follow the bob, correlations between x and t can certainly be made. But suppose the mass point is made to increase its frequency of vibration. The eye will soon be unable to perceive instantaneous positions, but the camera can still establish them. When the camera fails, oscillographic methods may be available, and after that, ingenious devices perhaps not yet invented may serve. But ultimately, a barrier of an essential kind will be encountered. Let us assume that the bob oscillates $10^{10}$ times per second. It is a fact of atomic physics that visible light requires about $10^{-8}$ seconds to be emitted (or reflected). Thus if it were used as the medium of report, the light-emitting mass would have to remain in a given position for approximately that length of time. In the present instance, however, the bob executes 100 vibrations within this period. A similar argument can finally be used to invalidate every other means for establishing the classical correspondence. The latter has to be ultimately abandoned because its use contradicts the laws of optics.

What, then, can be done? Perhaps the example suggests an answer. While a snapshot can in principle no longer be taken of the rapidly oscillating bob, a time exposure would reveal some features of its dynamical behavior. It would give essentially a correlation between the time the bob spends within a given interval dx and the location of that interval, in other words between x and the probability w dx of encountering it in dx. This leads to a less pretentious description of the physical system called a mass point, of the form w = p(x), and this description is characteristic of quantum mechanics. It is to be noted that p(x) can be inferred from the classical relation x = f(t), but not f(t) from w = p(x).
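The remark that p(x) can be inferred from x = f(t) may be illustrated for a simple harmonic motion. The sketch below is an assumption-laden example, not part of the original argument: it takes $x = a\sin 2\pi t$ with unit period, for which the fraction of time spent between $x_1$ and $x_2$ is $[\arcsin(x_2/a) - \arcsin(x_1/a)]/\pi$, and compares this with a "time exposure" obtained by sampling the motion at equally spaced instants. The function names and the sample count are hypothetical.

```python
import math


def time_fraction_sampled(x_lo, x_hi, amplitude=1.0, samples=200_000):
    """Fraction of one period that x = amplitude*sin(2*pi*t) spends in [x_lo, x_hi)."""
    hits = 0
    for k in range(samples):
        t = (k + 0.5) / samples            # equally spaced instants in one period
        x = amplitude * math.sin(2 * math.pi * t)
        if x_lo <= x < x_hi:
            hits += 1
    return hits / samples


def time_fraction_exact(x_lo, x_hi, amplitude=1.0):
    """Integral of p(x) = 1/(pi*sqrt(a^2 - x^2)) between x_lo and x_hi."""
    return (math.asin(x_hi / amplitude) - math.asin(x_lo / amplitude)) / math.pi


sampled = time_fraction_sampled(0.0, 0.5)
exact = time_fraction_exact(0.0, 0.5)      # equals 1/6
assert abs(sampled - exact) < 1e-3
```

The reverse inference fails: any motion that spends the same fraction of time in each interval, for instance the same oscillation with a shifted phase, gives the same p(x).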

Quantum mechanics provides the means for deducing probability relations of the type described, and it does so in a logically consistent fashion. But before turning to this central issue, let us see what has become of the concept: particle. Our time exposure has left it very ill defined. Indeed if the system called a mass point were invisibly small or never sufficiently stationary to permit the classical description, the customary properties of particles would never be exhibited. By the criterion of essential observability, the concept would lose its physical significance. From a misunderstanding of this situation there has arisen a claim that quantum mechanics




is that they are neither particles nor waves, but more abstract entities for 
the description of which quantum mechanics gives most simple and success- 
ful rules. The question as to the particle or wave nature of an electron 
must be put in the same class as that regarding its color — or, to use a lighter 
metaphor due to the philosopher Dingle, as the question concerning the 
color of an elephant's egg if an elephant laid eggs. 

Despite this fundamental situation we shall place no ban upon the use 
of the terms particle, wave, etc.; we shall even adhere to universal practice 
in calling the electron one of the elementary particles of nature ; we do this 
only, of course, as a concession to usage. But whenever a paradox arises, 
the reader should endeavor to resolve it by recalling that the “ classical 
language ” when applied to atomic entities is in fact metaphoric. 

AXIOMATIC FOUNDATION 

11.2. Definitions. — For the sake of brevity all historical considerations 
are omitted here. Nor will any attempt be made to “ deduce ” quantum 
mechanics either from classical physics or from outstanding experimental 
facts, for in a strict logical sense this cannot be done. We shall, however, 
present the framework of the theory with utmost economy of thought and 
space, committing the reader to the tacit understanding that all experi- 
mental consequences of the theory outlined have been verified as far as 
they could hitherto be tested. 

On a physical system , by which is meant any object of interest to physics 
or chemistry, numerous observations or measurements can be made. The 
quantities so observed or measured, such as size, energy, position and 
momentum, are called observables. It is well to think of these observables 
without ascribing to them the intuitive qualities they possess in classical 
mechanics. Position, or energy, is not so much possessed by a system as it 
is characteristic of a certain measuring process which can be carried out upon 
it. The measurement of an observable upon a system yields a number. 

In defining the state of a physical system considerable caution must be 
exercised, for we wish to remain in keeping with the requirements outlined 
in the introductory paragraphs. First it is well to notice that by state the 
scientist never means anything not subject to arbitrary fixation; indeed 
the definition of state is made to conform to the needs of each particular 
subject. It is quite different, for instance, in classical mechanics from 
what it is in thermodynamics or in electrodynamics. Hence we need not 
feel ill at ease when in quantum mechanics a new choice is made. Leaving 
elucidation until later: a state is a function of certain variables, a function [...]
obtained. The variables may be chosen in several ways, each giving
a consistent description equivalent to all others; here they will be taken to
be space coordinates, for this gives rise to the form of quantum mechanics
most commonly used, namely Schrödinger's. By state, or state function,
we thus mean a mathematical construct, $\phi(x_1, y_1, z_1; x_2, y_2, z_2; \dots)$.
It is possible, as we shall later see, to associate the variables $x_1, \dots, z_n$ with
the dimensions of configuration space of the classical analogue of the system
in question. In particular, the number of variables needed for a
complete description of its behavior (at a given instant of time) has
been found to be equal to the number of its classical degrees of freedom.
This must indeed be the case in order that large scale bodies be
described both by quantum mechanics and by classical mechanics. The state
may change with time; hence a state in its widest meaning may be written

$$\phi(x_1, y_1, z_1, \dots, z_n, t)$$


Certain restrictions are to be placed upon state functions, restrictions
which will take on greater plausibility in view of the postulates of the next
section. Most important among them are two: first $\phi$, which may be a
complex function, must possess an integrable square² in the sense

$$\int \phi^* \phi \, d\tau < \infty \tag{11-1}$$

where $d\tau$ is the "volume of configuration space," i.e., in Cartesian
coordinates

$$d\tau = dx_1\,dy_1\,dz_1 \cdots dx_n\,dy_n\,dz_n$$

Second,

$$\phi \text{ is single-valued} \tag{11-2}$$

The function $\phi$ may of course be expressed in any other system of
coordinates by the ordinary geometric transformations of [...].
Condition (2) is particularly important when one of the variables is an
angle, say $\alpha$, for it then requires that

$$\phi(\alpha) = \phi(\alpha + 2n\pi) \tag{11-3}$$

$n$ being an integer.

² This statement requires modification in some cases. See remarks on
"continuous spectrum," sec. 11.9c. Condition (1) must be rigorously satisfied
without exception when $\int d\tau$ is finite. It seems best to present the foundations of the
theory with this restriction, leaving necessary generalizations for later.

Finally we must include in our list of definitions another mathematical
construct, that of an operator. Every specific mathematical operation,
like adding $b$, or multiplying by $c$, or extracting the third root, can be
represented by a characteristic symbol which is then called an operator.
Examples of operators are: $b+$, $c\,\cdot$, $\sqrt[3]{\ }$, $\dfrac{d}{dx}$, $\int_a^b dx$, $A\dfrac{d^2}{dx^2} + B\dfrac{d}{dx} + C$, and so forth.
In general they act on functions. They can be applied in succession, and
when they are so applied, the order in which the operators occur is important.
For convenience, let us use more general symbols for operators, such
as $P$ and $Q$. If $P$ stands for $a+$ and $Q$ for $c\,\cdot$, then $PQf$ means $a + cf$ where
$f$ is a function; however $QPf$ means $c(a + f)$. Thus

$$QPf = PQf + (c - 1)a \tag{11-4}$$

Such an equation is said to be an operator equation. The reader will at
once verify that, if $P$ stands for $d/dx$ and $Q$ for $x\,\cdot$, the operator equation

$$PQf - QPf = f \tag{11-5}$$

holds.


There is an important difference between eqs. (4) and (5); the second
is homogeneous in $f$, the first is not. From the second, $f$ may be canceled
symbolically so that it reads

$$PQ - QP = 1 \tag{11-5'}$$

Only homogeneous operator equations of this kind, usually written in the
latter form without explicit insertion of the operand $f$, are of interest in
quantum mechanics.
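The operator equation $PQf - QPf = f$ for $P = d/dx$ and $Q = x\,\cdot$ can be checked numerically. The following is a minimal sketch, assuming a central-difference approximation to the derivative and $\sin x$ as an arbitrary test function; the helper names are illustrative and do not come from the text.

```python
import math

def derivative(g, x, h=1e-5):
    """Central-difference approximation to dg/dx at x."""
    return (g(x + h) - g(x - h)) / (2 * h)

def P(g):
    """Operator d/dx: returns a function approximating the derivative of g."""
    return lambda x: derivative(g, x)

def Q(g):
    """Operator 'multiply by x'."""
    return lambda x: x * g(x)

def commutator(g):
    """(PQ - QP) applied to g."""
    return lambda x: P(Q(g))(x) - Q(P(g))(x)

f = math.sin  # arbitrary test function (assumption)
for x0 in (0.3, 1.0, 2.5):
    # The two printed columns should agree to within the finite-difference error.
    print(x0, commutator(f)(x0), f(x0))
```

The order of application matters: `P(Q(g))` differentiates the product $x\,g(x)$, while `Q(P(g))` multiplies the derivative by $x$.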

The formalism of operators is convenient also in other ways. It is
possible, for instance, to define a periodic function $\phi(x)$ by writing

$$e^{hD}\phi(x) = \phi(x)$$

$D$ being $d/dx$; for the left-hand side is, on expansion, simply the Taylor
series for $\phi(x + h)$.
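That $e^{hD}$ acts as a shift by $h$ can be seen on a polynomial, for which the Taylor series terminates. The sketch below is illustrative only: the coefficient list and the function names are assumptions, and a polynomial is of course not periodic; the point is merely that the expanded operator reproduces $\phi(x + h)$.

```python
from math import factorial

def poly_eval(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def poly_deriv(coeffs):
    """Coefficients of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def shift(coeffs, x, h):
    """Apply exp(h*D) = sum_n (h*D)**n / n! to the polynomial, evaluated at x."""
    total, current, n = 0.0, list(coeffs), 0
    while current:  # the series stops once all derivatives vanish
        total += h ** n / factorial(n) * poly_eval(current, x)
        current = poly_deriv(current)
        n += 1
    return total

coeffs = [1.0, -2.0, 0.5, 3.0]  # 1 - 2x + 0.5x^2 + 3x^3 (arbitrary example)
print(shift(coeffs, 0.7, 0.4), poly_eval(coeffs, 0.7 + 0.4))
```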

Two operators, $P$ and $Q$, are said to commute when $PQ - QP$ is zero.
Thus $c\,\cdot$ and $d/dx$ commute if $c$ is a constant. Other examples of commuting
operators are: [...] and $\partial/\partial y$; $d/dx$ and [...] if $a$ and $b$ are constants;
[...] and $(-b)$. Clearly, every operator commutes with itself or any power
of itself, provided that by the n-th power we mean the n-fold iteration of
the operator.

11.3. Postulates.³ a. The fundamental postulates of quantum mechanics
are three in number. The first concerns the use of observables.

³ Henceforth in the present section, and in all subsequent sections up to 11.25,
states will be supposed to be independent of the time; i.e., $\phi$ does not contain $t$. Such
states are known as stationary ones, and the part of quantum mechanics dealing with
them will be called quantum statics. In quantum dynamics, introduced in sec. 11.25, a
new postulate (Schrödinger's "time" equation) will be needed. This postulate is not [...]


11.3 


QUANTUM MECHANICS 


338 


Brief reflection will show that classical physics associates with observables 
certain definite functions of suitable variables: $x, y, z$ with position, $mv$
with linear momentum, $\tfrac{1}{2}mv^2$ with kinetic energy, and so forth. These
functions are chosen to describe experience most adequately. There is no
logical reason which would exclude the use of more abstract mathematical
entities in this association. It has indeed been found that, for the descrip- 
tion of atomic phenomena, certain operators should replace the functions 
which in classical mechanics represent observables. The first postulate 
may be stated as follows : 

To every observable there corresponds an operator . 

The correct operator to be associated with a given observable must be 
found by trial. In the following table we give a brief summary of the four 
most important operators of quantum mechanics; the observables in 
question are understood to refer to systems classically described as groups 
of mass points having $3n$ degrees of freedom ($j = 1, 2, \dots, n$), subject to no
external forces (total energy constant) and not requiring relativity treat- 
ment. The first column gives the name of the observable, the second its 
classical representation, the third its quantum mechanical representation. 

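Observable                          Classical representation                    Quantum mechanical operator

Cartesian coordinate                $x_j$                                       $x_j\,\cdot$

Cartesian component of              $p_{x_j} = m_j \dot{x}_j$                   $-i\hbar\,\dfrac{\partial}{\partial x_j}$
linear momentum of j-th
particle

x-component of angular              $y_j p_{z_j} - z_j p_{y_j}$                 $-i\hbar\left(y_j\dfrac{\partial}{\partial z_j} - z_j\dfrac{\partial}{\partial y_j}\right)$
momentum of j-th particle

Total energy                        $\sum_j \dfrac{1}{2m_j}\left(p_{x_j}^2 + p_{y_j}^2 + p_{z_j}^2\right) + V(x_1 \cdots z_n)$        $-\sum_j \dfrac{\hbar^2}{2m_j}\left(\dfrac{\partial^2}{\partial x_j^2} + \dfrac{\partial^2}{\partial y_j^2} + \dfrac{\partial^2}{\partial z_j^2}\right) + V(x_1 \cdots z_n)$

$m_j$ is the mass of the j-th particle; $\hbar$ is an abbreviation for Planck's
constant, $h$, divided by $2\pi$.

The operator form of the Cartesian coordinate is identical with its
classical representation and has been included only for formal reasons.
Linear momentum, a differential operator, is basic in the construction of
the last two entries in the table.

When the operator corresponding to the linear momentum $\mathbf{p}$ of a single
particle is written in the vector form $-i\hbar\nabla$, those corresponding to angular
momentum and energy become: angular momentum $= \mathbf{r} \times \mathbf{p} = -i\hbar\,\mathbf{r} \times \nabla$;
energy $= (1/2m)p^2 + V = -(\hbar^2/2m)\nabla^2 + V$. These vector forms are
valid in all other systems of coordinates and should be used as the basis for
transformations.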

In view of the table, the reader will easily verify the following operator
equations:

Let $Q_k$ stand for the operator "k-th Cartesian coordinate," $P_k$ for the
k-th component of linear momentum. Then

$$P_k Q_l - Q_l P_k = -i\hbar\,\delta_{kl} \tag{11-6}$$

Also, if $L_x$, $L_y$ and $L_z$ denote the components of the angular momentum
operator for a single particle,

$$L_x L_y - L_y L_x = i\hbar L_z$$
$$L_y L_z - L_z L_y = i\hbar L_x \tag{11-7}$$
$$L_z L_x - L_x L_z = i\hbar L_y$$

Commutation rules, like (6) and (7), are often sufficient to define the
operators involved without recourse to their explicit form, but the latter is
usually helpful.
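The remark that commutation rules can characterize operators without reference to their explicit form may be illustrated with matrices. The sketch below assumes units with $\hbar = 1$ and uses the $3 \times 3$ matrices $(L_a)_{bc} = -i\hbar\,\epsilon_{abc}$, which are not the differential operators of the table but obey the same cyclic rules; all names are illustrative.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

def eps(a, b, c):
    """Levi-Civita symbol for indices drawn from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def L(a):
    """3x3 matrix with elements (L_a)_{bc} = -i * hbar * eps(a, b, c)."""
    return [[-1j * HBAR * eps(a, b, c) for c in range(3)] for b in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

def scaled(s, A):
    return [[s * x for x in row] for row in A]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

Lx, Ly, Lz = L(0), L(1), L(2)
# Each line prints True if the corresponding cyclic rule holds.
print(close(commutator(Lx, Ly), scaled(1j * HBAR, Lz)))
print(close(commutator(Ly, Lz), scaled(1j * HBAR, Lx)))
print(close(commutator(Lz, Lx), scaled(1j * HBAR, Ly)))
```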

b. The second postulate states:

The only possible values which a measurement of the observable whose
operator is $P$ can yield are the eigenvalues $p_\lambda$ of the equation

$$P\psi_\lambda = p_\lambda \psi_\lambda \tag{11-8}$$

provided $\psi_\lambda$ obeys conditions (1) and (2), namely: $\int \psi_\lambda^* \psi_\lambda \, d\tau < \infty$ and $\psi_\lambda$ is
single-valued.

The range of integration depends on the particular problem under
consideration, as will be seen later.

We illustrate the meaning of this postulate by a few examples. Let us
find the measurable values of the linear momentum of a particle, known to
be somewhere on the X-axis between the finite points $x = a$ and $x = b$.
The operator $P$ is $-i\hbar(d/dx)$. Eq. (8) therefore becomes a first-order
differential equation which can obviously be satisfied if $\psi_\lambda$ is assumed to be
a function of $x$ only. It reads

$$-i\hbar\,\frac{d\psi_\lambda}{dx} = p_\lambda \psi_\lambda$$

and has the solution

$$\psi_\lambda = c\,e^{(i/\hbar)p_\lambda x}$$

Is this solution satisfactory from the point of view of eqs. (1) and (2)?
It is certainly single-valued; moreover, $\int_a^b \psi_\lambda^* \psi_\lambda \, dx = (b - a)|c|^2$, which is finite
for every finite $c$. Hence no restriction upon $p_\lambda$ results; all values of the
linear momentum may be found upon measurement. The eigenvalues of
the linear momentum form a continuous spectrum ($\lambda$ is not a discrete
index) and every function of the form $c\,e^{(i/\hbar)px}$ with constant $p$ is an
eigenfunction. As far as measurable values of linear momentum are
concerned, quantum mechanics leads to the same result as classical mechanics.
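This is not true for the angular momentum of a single particle. For it
eq. (8) reads

$$-i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)\psi_\lambda = m_\lambda \psi_\lambda \tag{11-10}$$

provided we consider the z-component and write $m_\lambda$ for the eigenvalue.
Obviously, $\psi_\lambda$ must be a function of both $x$ and $y$. But a simple
transformation of coordinates reduces the equation to a simpler form. On
putting $x = r\cos\theta$ and $y = r\sin\theta$, we have

$$\frac{\partial}{\partial\theta} = -r\sin\theta\,\frac{\partial}{\partial x} + r\cos\theta\,\frac{\partial}{\partial y} = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}$$

Therefore eq. (10) becomes

$$-i\hbar\,\frac{\partial\psi_\lambda}{\partial\theta} = m_\lambda \psi_\lambda$$

and $\psi_\lambda$ is seen to be a function of $\theta$ alone. The solution is

$$\psi_\lambda = c\,e^{(i/\hbar)m_\lambda\theta}$$

It certainly has an integrable square, because the range of $\theta$ extends from 0
to $2\pi$, or more exactly, from $2\pi n$ to $2\pi(n + 1)$, where $n$ is an integer. But
$\psi_\lambda$ violates the condition of single-valuedness which must be imposed in
the form (3). To satisfy it we must require that

$$\psi_\lambda(\theta) = \psi_\lambda(\theta + 2\pi)$$

and this implies $e^{(2\pi i/\hbar)m_\lambda} = 1$. This is true only if

$$m_\lambda = \lambda\hbar, \quad \lambda \text{ an integer} \tag{11-11}$$

Hence the only observable values of the angular momentum are given by
(11), and the eigenfunctions are $c\,e^{i\lambda\theta}$. This result is identical with the
postulate of the older Bohr theory concerning angular momentum.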
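The role of the single-valuedness condition can be made concrete by evaluating the trial eigenfunction at $\theta$ and at $\theta + 2\pi$. This is a minimal sketch; the function name, the sample angle and the trial values of $m_\lambda/\hbar$ are illustrative assumptions.

```python
import cmath
import math

def mismatch(m_over_hbar, theta=0.4):
    """|psi(theta + 2*pi) - psi(theta)| for psi = exp(i * m * theta / hbar)."""
    psi = lambda t: cmath.exp(1j * m_over_hbar * t)
    return abs(psi(theta + 2.0 * math.pi) - psi(theta))

for value in (0.5, 1.0, 1.5, 2.0, -3.0):
    # The mismatch vanishes (to rounding error) only for integer m / hbar.
    print(value, mismatch(value))
```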
Next we consider the possible values of the total energy of a single mass
point. The energy operator appearing in the table is often referred to as the
Hamiltonian operator and is denoted by the symbol $H$. Let us use $E_\lambda$
for the eigenvalues. The operator equation then becomes

$$H\psi_\lambda \equiv -\frac{\hbar^2}{2m}\nabla^2\psi_\lambda + V(x,y,z)\,\psi_\lambda = E_\lambda\psi_\lambda \tag{11-12}$$

This equation, written perhaps more frequently in the form

$$\nabla^2\psi_\lambda + \frac{2m}{\hbar^2}(E_\lambda - V)\psi_\lambda = 0 \tag{11-12}$$

was found by Schrödinger and bears his name. Its solutions and
eigenvalues clearly depend on the functional nature of $V(x,y,z)$; they will be
reserved for detailed consideration in secs. 9 et seq.

A rather peculiar result is obtained when (8) is applied to the coordinate
"operator." The eigenvalues of "$x$" are the values $\xi_\lambda$ for which the
equation

$$x \cdot \psi_\lambda = \xi_\lambda \psi_\lambda$$

an ordinary algebraic one, possesses solutions. On writing it in the form

$$(x - \xi_\lambda)\psi_\lambda = 0$$

it is evident that either $x = \xi_\lambda$ or $\psi_\lambda = 0$. In plainer language, $\psi_\lambda$ as a
function of $x$ vanishes everywhere except at $x = \xi_\lambda$, a constant. From a
rigorous mathematical point of view such a function is a monstrosity, but
it is useful for certain purposes to introduce it, as Dirac⁵ has done. It is
called $\delta(x - \xi_\lambda)$, the symbol being fashioned after the Kronecker $\delta$, and is
to be visualized as something like $\lim_{a \to 0} c\,e^{-(x - \xi_\lambda)^2/a}$. For later use the
constant $c(a)$ will be so chosen that $\int \delta(x - \xi)\,dx = 1$, so that

$$\int f(x)\,\delta(x - \xi)\,dx = f(\xi) \tag{11-13}$$

Now it is clear that such a "function" can be formed for every value $\xi_\lambda$,
hence every point of the X-axis is an eigenvalue of the x-coordinate.
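The limiting behavior behind eq. (11-13) can be observed numerically by keeping $a$ small but finite. The sketch below assumes the normalization $c(a) = 1/\sqrt{\pi a}$, which gives the Gaussian unit area, and uses $\cos x$ as an arbitrary test function; the integration routine and its parameters are illustrative choices.

```python
import math

def delta_a(x, xi, a):
    """Unit-area Gaussian c(a) * exp(-(x - xi)**2 / a), with c(a) = 1/sqrt(pi*a)."""
    return math.exp(-(x - xi) ** 2 / a) / math.sqrt(math.pi * a)

def smeared(f, xi, a, n=4000):
    """Trapezoidal estimate of the integral of f(x) * delta_a(x - xi) dx."""
    half_width = 10.0 * math.sqrt(a)  # the Gaussian is negligible beyond this
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        x = xi - half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f(x) * delta_a(x, xi, a)
    return total * h

xi = 0.5
for a in (1e-1, 1e-2, 1e-3, 1e-4):
    # As a decreases the integral approaches f(xi) = cos(0.5).
    print(a, smeared(math.cos, xi, a), math.cos(xi))
```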
The significance of the second postulate is best grasped when it is
regarded as furnishing a catalogue of the measurable values of all observables
for which operators are known. It implies no information concerning
the meaning of the eigenfunctions $\psi_\lambda$. These are, of course, states of the
system in the sense explained. Their nature will unfold itself after the
third postulate has been set forth. For the present we only note that
every $\psi_\lambda$ is indeterminate with respect to a constant multiplier, for eq. (8)
will also be satisfied by constant $\cdot\,\psi_\lambda$. On the other hand, $\int \psi_\lambda^* \psi_\lambda \, d\tau$
exists. We may require, therefore, that $\psi_\lambda$ is normalized after the manner
of sec. 8.2. Henceforth this will be assumed unless a statement to the
contrary is made. In this connection it may be recalled, however, that
normalization may fail intrinsically when the eigenvalues $p_\lambda$ form a
continuous spectrum. In Chapter 8 this was shown to be the case in instances
where the range of the fundamental variable became infinite. Such cases
require special treatment.

The $\psi_\lambda$ will be orthogonal if operator and boundary conditions conform
to the circumstances of the Sturm-Liouville theory (sec. 8.5). This theory,
as will later be seen, covers most of the cases occurring in quantum
mechanics, but must be generalized somewhat to be applicable to complex
operators.

⁵ Dirac, P. A. M., "Principles of Quantum Mechanics," Third Edition; Clarendon
Press, Oxford, 1947.

c. We turn to the third postulate which states:

When a given system is in a state $\phi$, the expected mean of a sequence of
measurements on the observable whose operator is $P$ is given by

$$\bar{p} = \int \phi^* P \phi \, d\tau \tag{11-14}$$

The expected mean is defined as in statistics: If a large number $N$ of
measurements is made on the system, and the measured values are $p_1, p_2, \dots, p_N$,
then $\bar{p} = (1/N)\sum_{i=1}^{N} p_i$. Note that eq. (14) does not predict the outcome of a
single measurement.

In writing (14) we are again supposing that $\phi$ is normalized. This can
be brought about in all physical problems by "confining" the system in
configuration space, that is, by taking the volume in which it moves to be
finite, so that $\int \phi^* \phi \, d\tau$ exists. Even if the volume is infinite, the integral may
still exist, but in general the situation then calls for special treatment
involving the use of eigendifferentials instead of eigenfunctions. A more
general form of eq. (14), which often works when the volume of
configuration space is infinite, is the following

$$\bar{p} = \frac{\int \phi^* P \phi \, d\tau}{\int \phi^* \phi \, d\tau} \tag{11-14'}$$

We illustrate the meaning of (14) by a few examples. Let a
system having one degree of freedom be in a state described by
$\phi = (b/\pi)^{1/4} e^{-(b/2)(x - \xi)^2}$. Then the mean value of its position will be:

$$\bar{x} = \int \phi^2 x \, dx = \xi$$

Its mean momentum:

$$\bar{p}_x = -i\hbar \int \phi\,\phi' \, dx = 0$$

Its mean kinetic energy:

$$\bar{E}_{kin} = -\frac{\hbar^2}{2m}\int \phi\,\phi'' \, dx = \frac{\hbar^2}{2m}\int (\phi')^2 \, dx = \frac{\hbar^2 b}{4m}$$

It is interesting to note that, the more concentrated the function $\phi$
(the greater $b$) the larger will be the mean kinetic energy. To calculate
the mean total energy we should have to know the form of $V(x)$.
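The three mean values for a Gaussian state can be reproduced by direct numerical integration of eq. (14). The sketch below takes the state to be $\phi = (b/\pi)^{1/4} e^{-(b/2)(x-\xi)^2}$ and assumes units with $\hbar = m = 1$ together with the illustrative values $b = 2$, $\xi = 0.5$; the trapezoidal routine and all names are assumptions made for the example.

```python
import math

HBAR, MASS = 1.0, 1.0   # assumption: units with hbar = m = 1
B, XI = 2.0, 0.5        # assumption: illustrative width parameter and center

def phi(x):
    """Normalized Gaussian state function."""
    return (B / math.pi) ** 0.25 * math.exp(-0.5 * B * (x - XI) ** 2)

def dphi(x):
    """Analytic first derivative of phi."""
    return -B * (x - XI) * phi(x)

def integrate(g, lo=XI - 10.0, hi=XI + 10.0, n=4000):
    """Trapezoidal rule; the integrands are negligible at the end points."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for k in range(1, n):
        total += g(lo + k * h)
    return total * h

norm = integrate(lambda x: phi(x) ** 2)
x_mean = integrate(lambda x: phi(x) ** 2 * x)
# The mean momentum is -i*hbar times this integral, which vanishes.
p_integral = integrate(lambda x: phi(x) * dphi(x))
e_kin = HBAR ** 2 / (2 * MASS) * integrate(lambda x: dphi(x) ** 2)

print(norm, x_mean, p_integral, e_kin, HBAR ** 2 * B / (4 * MASS))
```

Increasing `B` narrows the state and raises `e_kin` in proportion, in line with the remark about concentration.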

Let us take $\phi = e^{ikx}/(b - a)^{1/2}$, the particle being confined between $x = a$ and $x = b$. We then find

$$\bar{x} = \frac{b + a}{2}$$

$$\bar{p}_x = -i\hbar \int \phi^* \phi' \, dx = k\hbar$$

$$\bar{E}_{kin} = -\frac{\hbar^2}{2m}\int \phi^* \phi'' \, dx = \frac{k^2\hbar^2}{2m}$$

If in this example the range is extended to infinity, let us say in such a
way that $-a = b \to \infty$, the function $e^{ikx}$ can clearly not be normalized.
One must then use eq. (14′) in the form

$$\bar{p} = \lim_{b \to \infty} \frac{\int_{-b}^{b} \phi^* P \phi \, dx}{\int_{-b}^{b} \phi^* \phi \, dx}$$

which gives the same results as those obtained above.

The three postulates here stated and exemplified do not reveal an
intuitive meaning of the state function $\phi$. It is therefore not unusual in
textbooks on quantum mechanics to add another postulate stating that
$\phi^*(x)\phi(x)$ signifies the probability that the "particle" whose state is
$\phi$ is to be found at $x$. This, however, is
not a further postulate since it may be deduced from those already stated.
(Cf. sec. 6.)

DEDUCTIONS FROM THE POSTULATES 

11.4. Orthogonality and Completeness of Eigenfunctions. In Chapter
8, orthogonality and completeness of the eigenfunctions belonging to a
Sturm-Liouville operator $L$ have been discussed. The proofs there given
need to be generalized if they are to be applied to quantum mechanics, for
the operators occurring there are not all of the same structure as $L$. (One
of the most important equations encountered, the one-dimensional
Schrödinger equation (12), is of the Sturm-Liouville type.) They often involve
many variables, they may be differential operators of the first order, they
may be complex; in fact they may not be differential operators at all.
To simplify the theory we shall assume that the eigenvalues $p_\lambda$ [...]
are discrete, and that the boundary conditions on acceptable state
functions are of the form 1 and 2. Whenever convenient we shall even assume
that $\phi$ vanishes at the boundary of configuration space, over which
integrations are to be carried out, in a manner suitable to our needs. Unless
these restrictions are made the arguments become involved and in some
respects problematic. It would then be necessary to conduct the
proof for every problem of interest; thus elegance would fall prey [...].

We first define what is meant by an Hermitian operator. Let $u$ and $v$
be two "acceptable" functions, defined over a certain range of
configuration space $\tau$. We then say that the operator $P$ is Hermitian if

$$\int u^* \cdot Pv \, d\tau = \int v\,P^* u^* \, d\tau \tag{11-15}$$

All operators of interest in quantum mechanics have this property. As a
sample proof we show this for the linear momentum $P_j = -i\hbar\,\partial/\partial q_j$
associated with the j-th Cartesian coordinate:

$$\int u^* P_j v \, d\tau = -i\hbar \int u^* \frac{\partial v}{\partial q_j} \, dq_1 \cdots dq_n$$

First perform the integration over $q_j$, which yields

$$-i\hbar \int \big[u^* v\big] \, dq_1 \cdots dq_{j-1}\,dq_{j+1} \cdots dq_n + i\hbar \int_\tau v\,\frac{\partial u^*}{\partial q_j} \, d\tau$$

The first integral, a "surface" integral taken only over $n - 1$
coordinates but with $u$ and $v$ evaluated at the end points of the range of $q_j$,
will vanish provided $u$ and $v$ vanish sufficiently strongly for the extreme
values of $q_j$, which is what we are supposing. The remaining integral is
$\int v\,P_j^* u^* \, d\tau$, since $P_j^* = +i\hbar\,\partial/\partial q_j$.

The Hermitian property of $x\,\cdot$ is obvious. To prove it for the
Hamiltonian $H$, two partial integrations are necessary; the details may be left
as an exercise for the reader.

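Hermitian operators have real eigenvalues. This fact follows at once
from eq. (15). The eigenvalues of $P$ are defined by the equation

$$P\psi_\lambda = p_\lambda \psi_\lambda \tag{11-16}$$

This also implies the validity of the equation

$$P^*\psi_\lambda^* = p_\lambda^* \psi_\lambda^* \tag{11-17}$$

Now multiply (16) by $\psi_\lambda^*$ and (17) by $\psi_\lambda$, and integrate over $d\tau$ obtaining

$$\int \psi_\lambda^* P \psi_\lambda \, d\tau = p_\lambda \int \psi_\lambda^* \psi_\lambda \, d\tau$$

$$\int \psi_\lambda P^* \psi_\lambda^* \, d\tau = p_\lambda^* \int \psi_\lambda^* \psi_\lambda \, d\tau$$

By (15) the left-hand sides of these two equations are equal, for $\psi_\lambda$ is
certainly an acceptable function in the sense outlined before. Hence $p_\lambda = p_\lambda^*$,
i.e., $p_\lambda$ is real. Since the eigenvalues of operators are measurable values of
observables, which must of necessity be real, the physical significance of an
operator is assured when it has the Hermitian property.

Let us again consider eq. (16). If $\psi_\mu$ is some other eigenfunction, it is
evident that

$$\int \psi_\mu^* P \psi_\lambda \, d\tau = p_\lambda \int \psi_\mu^* \psi_\lambda \, d\tau \tag{11-18}$$

But if we start with the equation

$$P^*\psi_\mu^* = p_\mu \psi_\mu^*$$

which is true because $p_\mu$ is real, we also conclude that

$$\int \psi_\lambda P^* \psi_\mu^* \, d\tau = p_\mu \int \psi_\mu^* \psi_\lambda \, d\tau \tag{11-19}$$

Combining (18) and (19) we find

$$\int \psi_\mu^* P \psi_\lambda \, d\tau - \int \psi_\lambda P^* \psi_\mu^* \, d\tau = (p_\lambda - p_\mu)\int \psi_\mu^* \psi_\lambda \, d\tau$$

If $P$ is Hermitian the left-hand side vanishes. Hence either $p_\lambda = p_\mu$ or
$\int \psi_\mu^* \psi_\lambda \, d\tau = 0$. We see that eigenfunctions of Hermitian operators,
belonging to different eigenvalues, are orthogonal.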
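Both results have a familiar finite-dimensional analogue, in which the operator is a Hermitian matrix and the integral $\int u^* v \, d\tau$ becomes a sum. The sketch below is only such an analogue, not the function-space proof: the $2 \times 2$ matrix is an arbitrary assumption (with nonzero off-diagonal element), and the eigenpairs are obtained from the closed-form solution of the quadratic characteristic equation.

```python
import math

# Assumption: an arbitrary 2x2 Hermitian matrix with nonzero off-diagonal element.
H = [[2.0, 1 - 1j],
     [1 + 1j, -1.0]]

def eigenpairs(M):
    """Eigenvalues and (unnormalized) eigenvectors of a 2x2 Hermitian matrix."""
    a, c, d = M[0][0].real, M[0][1], M[1][1].real
    mean = (a + d) / 2
    root = math.sqrt(((a - d) / 2) ** 2 + abs(c) ** 2)
    # For eigenvalue lam, the vector (c, lam - a) satisfies (M - lam) v = 0.
    return [(lam, [c, lam - a]) for lam in (mean + root, mean - root)]

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def inner(u, v):
    """Discrete analogue of the integral of u* v."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

(lam1, v1), (lam2, v2) = eigenpairs(H)
print(lam1, lam2)            # both eigenvalues are real numbers
print(abs(inner(v1, v2)))    # close to zero: the eigenvectors are orthogonal
```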

The completeness of the eigenfunctions of all operators employed in
quantum mechanics is usually assumed. To the authors' knowledge, a
rigorous proof has not been given. Since, however, our main interest will
be in the Schrödinger equation which is of the Sturm-Liouville type, this
point need not detain us further. In the following we shall assume
completeness of all $\psi_\lambda$ whenever this property is needed.

Problem. Show that the angular momentum operator $L_z = -i\hbar\,(\partial/\partial\theta)$ is Hermitian.

11.5. Relative Frequencies of Measured Values. Important
consequences can now be deduced from the third postulate, eq. (14). We first
note that, if $P$ is Hermitian, every power of $P$ is Hermitian. Moreover, if
(14) is true for every operator $P$, it must certainly hold for the operator
$P^r$. It implies, therefore,

$$\overline{p^r} = \int \phi^* P^r \phi \, d\tau, \quad r = 1, 2, \dots \tag{11-20}$$

The left-hand side stands, of course, for the r-th moment of the
statistical aggregate of the measured values, i.e.,

$$\overline{p^r} = \sum_i p_i^r \rho_i \tag{11-21}$$

provided $\rho_i$ is the relative frequency of the occurrence of the i-th
eigenvalue $p_i$ in the set of measurements. In accordance with eq. (20), the
state function $\phi$ predicts not only the mean, but all moments of the
aggregate of measurements. Now eq. (20) may be transformed as follows.
Let the eigenfunctions of $P$ be denoted by $\psi_\lambda$ so that $P\psi_\lambda = p_\lambda\psi_\lambda$. On
allowing $P$ to operate on both sides of this equation, there results
$P^2\psi_\lambda = p_\lambda P\psi_\lambda = p_\lambda^2\psi_\lambda$. By continuing this process, the relation

$$P^r\psi_\lambda = p_\lambda^r\psi_\lambda \tag{11-22}$$

is established. If the function $\phi$ appearing in (20) is expanded in terms
of the $\psi_\lambda$,

$$\phi = \sum_\lambda a_\lambda \psi_\lambda$$

and this series is substituted, we find

$$\overline{p^r} = \int \sum_{ij} a_i^* a_j \psi_i^* P^r \psi_j \, d\tau = \sum_{ij} a_i^* a_j p_j^r \int \psi_i^* \psi_j \, d\tau = \sum_i a_i^* a_i p_i^r$$

by virtue of (22) and the orthogonality of the $\psi_i$. Comparing this with
(21) it is clear that

$$\sum_i \rho_i p_i^r = \sum_i |a_i|^2 p_i^r$$

for every integer $r$. But this can be true only if

$$\rho_i = |a_i|^2 \tag{11-23}$$

In words: when the system is in the state $\phi$, a measurement of the
observable corresponding to $P$ will yield the value $p_i$ with a probability (relative
frequency) $|a_i|^2$, $a_i$ being the coefficient of $\psi_i$ in the expansion $\phi = \sum_\lambda a_\lambda\psi_\lambda$,
and $\psi_i$ is one of the eigenfunctions of $P$. The coefficients $a_i$ are called
probability amplitudes.

They may be expressed in terms of $\phi$ and $\psi_i$ by the relation

$$\int \psi_i^* \phi \, d\tau = \sum_\lambda a_\lambda \int \psi_i^* \psi_\lambda \, d\tau = a_i \tag{11-24}$$

Consequently, eq. (23) may also be written

$$\rho_i = \left| \int \psi_i^* \phi \, d\tau \right|^2 \tag{11-25}$$

An interesting result is obtained when, in this equation, we let $\phi$ be
one of the eigenfunctions belonging to the operator $P$ itself, e.g., $\psi_j$. It
then reads

$$\rho_i = \left| \int \psi_i^* \psi_j \, d\tau \right|^2 = \delta_{ij}$$

All relative frequencies are zero except the one measuring the occurrence
of the eigenvalue $p_j$, which is unity. Thus we conclude that an eigenstate
$\psi_j$ of an operator $P$ is a state in which the system yields with certainty the
value $p_j$ when the observable corresponding to $P$ is measured.
Eigenfunctions are simply state functions of this determinate character.
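The relation between relative frequencies and expansion coefficients can be tried out on a concrete orthonormal set. The sketch below assumes, purely for illustration, the functions $\psi_n = \sqrt{2/l}\,\sin(n\pi x/l)$ on $0 \le x \le l$ (energy eigenfunctions of a particle confined to that interval) and the normalized state $\phi = \sqrt{30/l^5}\,x(l - x)$; neither choice comes from this section.

```python
import math

LENGTH = 1.0  # assumption: the interval is 0 <= x <= LENGTH

def psi(n, x):
    """Orthonormal eigenfunctions sqrt(2/l) * sin(n*pi*x/l)."""
    return math.sqrt(2.0 / LENGTH) * math.sin(n * math.pi * x / LENGTH)

def phi(x):
    """Normalized state function sqrt(30/l^5) * x * (l - x)."""
    return math.sqrt(30.0 / LENGTH ** 5) * x * (LENGTH - x)

def amplitude(n, steps=2000):
    """Probability amplitude a_n = integral of psi_n * phi (both functions are real)."""
    h = LENGTH / steps
    # Trapezoidal rule; the integrand vanishes at both end points.
    return h * sum(psi(n, k * h) * phi(k * h) for k in range(1, steps))

freqs = {n: amplitude(n) ** 2 for n in range(1, 10)}
for n, value in freqs.items():
    print(n, value)
print("sum of relative frequencies:", sum(freqs.values()))
```

For this particular state the exact value of the first relative frequency is $960/\pi^6$, so a measurement almost always yields the lowest eigenvalue, and the frequencies for even $n$ vanish by symmetry.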

11.6. Intuitive Meaning of a State Function. Consider now a system,
like a simple mass point with one degree of freedom, whose state function is
$\phi(x)$. We wish to know the probability that a measurement of its position
will give the value $x = \xi$. The eigenfunction corresponding to the
operator $x\,\cdot$ for the value $\xi$ has been shown to be

$$\psi_\xi = \delta(x - \xi)$$

Eq. (25) now reads

$$\rho_\xi = \left| \int \delta(x - \xi)\,\phi(x)\,dx \right|^2 = |\phi(\xi)|^2 \tag{11-26}$$

by virtue of (13). The probability (density) of finding the system at $\xi$ is
$|\phi(\xi)|^2$.

Let $q_1, q_2, \dots, q_n$ be the coordinates on which $\phi$ depends. Using the
former arguments, the eigenfunction corresponding to the composite
coordinate operator $q_1\!\cdot q_2\!\cdot \cdots q_n\cdot$ may be shown to be

$$\psi_{\xi_1 \cdots \xi_n} = \delta(q_1 - \xi_1)\,\delta(q_2 - \xi_2) \cdots \delta(q_n - \xi_n) \tag{11-27}$$

If, therefore, we wish to find the probability $\rho_{\xi_1 \cdots \xi_n}$ of finding the system
at the point $(\xi_1 \xi_2 \cdots \xi_n)$ of configuration space, we must use eq. (25) with
$\psi_i$ replaced by (27). Hence

$$\rho_{\xi_1 \cdots \xi_n} = \left| \int\!\!\int \cdots \int \delta(q_1 - \xi_1) \cdots \delta(q_n - \xi_n)\,\phi(q_1 q_2 \cdots q_n)\,dq_1\,dq_2 \cdots dq_n \right|^2 = |\phi(\xi_1 \xi_2 \cdots \xi_n)|^2$$


11.7. Commuting Operators. Let $P$ and $R$ be two operators satisfying
the relation $PR - RP = 0$, and let their eigenfunctions be $\psi_\lambda$ and $\chi_\mu$,
that is

$$P\psi_\lambda = p_\lambda\psi_\lambda, \quad R\chi_\mu = r_\mu\chi_\mu \tag{11-28}$$

We assume the state function to be $\psi_i$ so that, when $P$ is measured, there
results with certainty the value $p_i$. But

$$RP\psi_i = PR\psi_i = p_i R\psi_i$$

Considering only the last two members of this equation, we may say that
$(R\psi_i)$ is an eigenfunction of $P$, namely that belonging to the eigenvalue $p_i$.
But this is possible only if $R\psi_i = \text{const.} \cdot \psi_i$. Comparison with the second
equation (28) shows the constant to be one of the $r_\mu$, and $\psi_i$ to be one of the
eigenfunctions $\chi_\mu$. We conclude that commuting operators have
simultaneous eigenstates; i.e., measurements on their observables yield definite
values for both; they do not "spread."
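A small matrix example shows the same behavior. The sketch below is a finite-dimensional analogue under stated assumptions: `P` and `R` are arbitrary commuting symmetric $2 \times 2$ matrices with nondegenerate eigenvalues, and `Q` is a matrix chosen so that it does not commute with `P`; the names are illustrative.

```python
P = [[2.0, 1.0], [1.0, 2.0]]
R = [[0.0, 3.0], [3.0, 0.0]]    # commutes with P
Q = [[1.0, 0.0], [0.0, -1.0]]   # does not commute with P

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def proportionality(M, v, tol=1e-12):
    """Return c if M v = c v within tolerance, otherwise None."""
    w = apply(M, v)
    c = (w[0] * v[0] + w[1] * v[1]) / (v[0] ** 2 + v[1] ** 2)
    return c if all(abs(w[i] - c * v[i]) < tol for i in range(2)) else None

psi = [1.0, 1.0]  # eigenvector of P belonging to the eigenvalue 3
print(commutator(P, R), proportionality(R, psi))  # zero commutator; R psi = 3 psi
print(commutator(P, Q), proportionality(Q, psi))  # nonzero commutator; None
```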

The fact that, when P and Q are non-commuting operators and the 
state of the system is an eigenstate of P, measurements on Q will give a 
statistical aggregate of values and not a single one with certainty, is usually 
attributed to the interference of measuring devices. For instance, the 
measurement of a particle's position disturbs its momentum, and vice 
versa, so that when one is ascertained with precision, the other quantity 
loses it. From this point of view, measurements on the observables
associated with commuting operators are said to be compatible; the procedures
of measurement do not conflict with each other.

11.8. Uncertainty Relation. The proof of the famous Heisenberg
uncertainty principle which will now be given requires the use of an inequality
of the Schwarz type. If $u$ and $v$ are acceptable
functions in the sense specified in connection with the definition of
Hermitian operators (sec. 11.4), then

$$\int u^* u \, d\tau \cdot \int v^* v \, d\tau \ge \frac{1}{4}\left[ \int (u^* v + v^* u)\,d\tau \right]^2 \tag{11-29}$$

We assume a system to be in a state $\phi$, which need not be an eigenstate
of any particular operator, and we are interested in the results of
measurements on the observables belonging to two operators, $P$ and $Q$, at present
unspecified. Introduce into eq. (29) the following functions

$$u = (P - \bar{p})\phi \quad \text{and} \quad v = i(Q - \bar{q})\phi$$

where $\bar{p}$ and $\bar{q}$ are mean values associated with $P$ and $Q$ through the
relation (14). Eq. (29) then reads

$$\int (P - \bar{p})^*\phi^*(P - \bar{p})\phi \, d\tau \cdot \int (Q - \bar{q})^*\phi^*(Q - \bar{q})\phi \, d\tau \ge \frac{1}{4}\left[ i\int (P - \bar{p})^*\phi^*(Q - \bar{q})\phi \, d\tau - i\int (Q - \bar{q})^*\phi^*(P - \bar{p})\phi \, d\tau \right]^2$$

Now $P$ and $Q$ are Hermitian and satisfy eq. (15); $\bar{p}$ and $\bar{q}$ are constants.
Therefore the inequality reduces to

$$\int \phi^*(P - \bar{p})^2\phi \, d\tau \cdot \int \phi^*(Q - \bar{q})^2\phi \, d\tau \ge -\frac{1}{4}\left[ \int \phi^*(PQ - QP)\phi \, d\tau \right]^2 \tag{11-30}$$

Let us consider the meaning of the quantity $\int \phi^*(P - \bar{p})^2\phi \, d\tau$. When
$\phi$ is expanded in eigenfunctions $\psi_\lambda$ of $P$, $\phi = \sum_\lambda a_\lambda\psi_\lambda$, and the expansion is
introduced in the integral, the result is $\sum_\lambda |a_\lambda|^2 (p_\lambda - \bar{p})^2$, and this, in view
of eq. (23), is nothing other than the dispersion of the statistical aggregate
of p-measurements about their mean. For this quantity we may
introduce the more familiar symbol $\Delta p^2$. A similar identification is to be made
for $\int \phi^*(Q - \bar{q})^2\phi \, d\tau$. Inequality (30) then takes the more interesting
form

$$\Delta p^2 \cdot \Delta q^2 \ge -\frac{1}{4}\left[ \int \phi^*(PQ - QP)\phi \, d\tau \right]^2 \tag{11-31}$$

Now if $P$ and $Q$ commute, the right-hand side is zero, and it is possible for
$\Delta p^2$ or $\Delta q^2$ to be zero, or even for both to vanish. This statement
recalls the result of sec. 7, which was that both p- and q-measurements
could yield single values without spread.

When $P$ and $Q$ do not commute, relation (31) sets a lower limit for the
product of the dispersions, often called uncertainties. Suppose, for
instance, that $P$ is the operator $-i\hbar(\partial/\partial q)$, the linear momentum
associated with $q$, and $Q$ stands for the coordinate $q$. We then have

$$PQ - QP = -i\hbar \tag{11-32}$$

When this is put into (31) the result is $\Delta p^2 \cdot \Delta q^2 \ge \hbar^2/4$, or, in
terms of standard deviations, $\delta p$ and $\delta q$,

$$\delta p \cdot \delta q \ge \frac{\hbar}{2} \tag{11-33}$$

This is Heisenberg's uncertainty relation.
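The inequality can be tested on explicit state functions by computing both standard deviations from the postulates. The sketch below assumes units with $\hbar = 1$ and two illustrative real states: a Gaussian (with the arbitrary width parameter $b = 3$), for which the product equals $\hbar/2$, and the lowest sine function on the unit interval, for which it is larger. The function names and the integration routine are assumptions made for the example.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1

def integrate(g, lo, hi, n=4000):
    """Trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for k in range(1, n):
        total += g(lo + k * h)
    return total * h

def uncertainty_product(phi, dphi, lo, hi):
    """delta_q * delta_p for a real, normalized phi that vanishes at lo and hi."""
    q_mean = integrate(lambda x: x * phi(x) ** 2, lo, hi)
    var_q = integrate(lambda x: (x - q_mean) ** 2 * phi(x) ** 2, lo, hi)
    # For such a phi the mean momentum is zero and <p^2> = hbar^2 * integral of phi'^2.
    var_p = HBAR ** 2 * integrate(lambda x: dphi(x) ** 2, lo, hi)
    return math.sqrt(var_q * var_p)

b = 3.0  # illustrative Gaussian parameter
gauss = lambda x: (b / math.pi) ** 0.25 * math.exp(-0.5 * b * x * x)
dgauss = lambda x: -b * x * gauss(x)

box = lambda x: math.sqrt(2.0) * math.sin(math.pi * x)
dbox = lambda x: math.sqrt(2.0) * math.pi * math.cos(math.pi * x)

print(uncertainty_product(gauss, dgauss, -10.0, 10.0))  # approximately hbar / 2
print(uncertainty_product(box, dbox, 0.0, 1.0))         # larger than hbar / 2
```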

Our result need not be cast in the form of an inequality, for it is
quite possible to calculate both $\delta p$ and $\delta q$ separately and exactly when the
state function $\phi$ is given, as the postulates show.

A slight generalization of the present conclusions is also of interest.
There are other operators, such as $L_z$ and $\theta$ (cf. eq. 10 et seq.) which
obey eq. (32). In fact all quantities which are called canonically
conjugate in classical physics have operators which satisfy it. (Later we
shall see that energy and time belong to this class.) For all these the
uncertainty relation in the form (33) is valid.

Problem. Show that, if the state function $\phi$ is an eigenfunction of the angular
momentum operator $L_z$ corresponding to the eigenvalue $l_z$, the product of the
uncertainties in $L_x$ and $L_y$ is at least as great as $(\hbar/2)\,l_z$.

SCHRÖDINGER EQUATIONS

Attention will now be given to the eigenvalues and eigenfunctions of
the energy operator, that is, to the solutions of the various forms of
Schrödinger equations, eq. (12).

11.9. Free Mass Point. The simplest example of a physical system
is the free mass point for which the potential energy $V$ may be taken as
zero. In that case eq. (12) reads

$$\nabla^2\psi + k^2\psi = 0 \tag{11-34}$$

provided we omit the subscript $\lambda$ and write $k^2 = 2mE/\hbar^2$. The quantity
$k^2$ has a rather simple classical significance which it is well to
notice. For if $E$ is the total energy of the particle, which is in this case
purely kinetic, then $E = \tfrac{1}{2}mv^2 = p^2/2m$. Hence $k = p/\hbar$, $p$ being the
classical momentum of the particle. Note also that $k$ has the dimension
of a reciprocal length.

Eq. (34) has already been solved in Chapter 7 (cf. eq. 7-33), where it
appeared as the space form of the wave equation. To select the proper
solution, we must consider the fundamental domain, $\tau$, of our problem.
Here, a great number of possibilities present themselves.

a. Enclosure is a Parallelepiped. If the particle is known to be within
a parallelepiped of side lengths $l_1$, $l_2$, and $l_3$, then $\tau$ is this volume of space.
Moreover, since $|\psi(xyz)|^2$ has already been identified as the probability
of finding the particle at the point $x, y, z$, this quantity must certainly be
zero everywhere outside $\tau$. For reasons of continuity (which can, by more
expanded arguments, be shown to result from our axioms) we require that
$|\psi|^2$, and hence $\psi$ itself, shall vanish on the boundaries of $\tau$ also. In view
of this boundary condition, the solution of (34) in rectangular coordinates,
namely eq. 7-36, must be chosen. In more explicit form it reads

$$\psi = (A_1 e^{ik_1 x} + B_1 e^{-ik_1 x})(A_2 e^{ik_2 y} + B_2 e^{-ik_2 y})(A_3 e^{ik_3 z} + B_3 e^{-ik_3 z}), \quad k^2 = k_1^2 + k_2^2 + k_3^2$$

The origin of the parallelepiped may be taken in one corner. Vanishing
of $\psi$ at the boundary then requires:

$$A_s + B_s = 0, \quad A_s e^{ik_s l_s} + B_s e^{-ik_s l_s} = 0, \quad s = 1, 2, 3$$

The first condition makes each parenthesis of $\psi$ a sine-function; the second
implies

$$k_s l_s = n_s \pi$$

where $n_s$ is an integer. Hence

$$\psi = c\,\sin\!\left(\frac{n_1\pi}{l_1}x\right)\sin\!\left(\frac{n_2\pi}{l_2}y\right)\sin\!\left(\frac{n_3\pi}{l_3}z\right) \tag{11-35}$$

so that

$$E = \frac{\pi^2\hbar^2}{2m}\left(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\right) \tag{11-36}$$

If $\psi$ is to be normalized, $\int \psi^*\psi \, dx\,dy\,dz = 1$, and the constant $c$ has
the value

$$c = \left(\frac{8}{l_1 l_2 l_3}\right)^{1/2} = \left(\frac{8}{\tau}\right)^{1/2}$$

The permitted energy values form a denumerably infinite set. Their
arrangement is best represented by constructing a lattice of points filling
all space, with the "reciprocal" parallelepiped of sides $1/l_1$, $1/l_2$, $1/l_3$ as
crystallographic unit. If from a given point lines are drawn to all other
points, the squares of the lengths of these lines (multiplied by $\pi^2\hbar^2/2m$)
are the energies of our problem. However, not all these lines correspond to
different states. The function $\psi$ changes only its sign when one of the
integers $n_1$, $n_2$ or $n_3$ changes sign; it is not thereby converted into a new
linearly independent function. Hence only the lines lying in one octant
of the lattice, with the origin of the lines at one corner, will represent
different states. If some of the $l$'s are equal there will be degeneracy (cf.
sec. 8.6), for then an interchange of the corresponding $n$'s will not produce
a different $E$, while $\psi$ will be changed into a function which is linearly
independent from the original one.
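The counting of states in one octant of the reciprocal lattice is easy to carry out by enumeration. The sketch below measures energies in units of $\pi^2\hbar^2/2m$ and takes integer side lengths so that exact rational arithmetic can be used; the function name, the cutoff `n_max` and the sample boxes are assumptions for illustration, and only levels low enough to be unaffected by the cutoff are meaningful.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def level_counts(sides, n_max=4):
    """Map each energy (units of pi^2 hbar^2 / 2m) to its number of states.

    Only positive integers n1, n2, n3 <= n_max are enumerated, i.e. one octant
    of the lattice, truncated at n_max.
    """
    counts = Counter()
    for ns in product(range(1, n_max + 1), repeat=3):
        energy = sum(Fraction(n * n, l * l) for n, l in zip(ns, sides))
        counts[energy] += 1
    return dict(sorted(counts.items()))

cube = level_counts((1, 1, 1))   # equal sides: degeneracy appears
for energy, degeneracy in list(cube.items())[:6]:
    print(energy, degeneracy)

box = level_counts((1, 2, 3))    # unequal sides: the ground level is single
print(min(box), box[min(box)])
```

For the cube the level with $n_1^2 + n_2^2 + n_3^2 = 14$ is six-fold degenerate, since the three distinct integers 1, 2, 3 can be assigned to the axes in six ways.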


b. Enclosure is a Sphere. Eq. (34) must now be solved in spherical
coordinates. But this has already been done in sec. 8.4 (cf. [...])
for an acoustical problem. The eigenfunctions are, aside from a normalizing
factor, $\psi = Y_l(\theta, \varphi)\,r^{-1/2} J_{l+1/2}(kr)$. The permitted energies are
determined by the condition $J_{l+1/2}(ka) = 0$ where $a$ is the radius of the
enclosure. For any integer $l$, there will be an infinite set of roots [...]
which we shall label $r_{ln}$, $n = 1, 2, \dots, \infty$. The permitted $k$'s are

$$k_{ln} = \frac{r_{ln}}{a}$$

and hence $E$, which will also depend on two indices (quantum numbers),
is given by

$$E_{ln} = \frac{\hbar^2}{2ma^2}\,r_{ln}^2$$

The simple model treated here is called the "infinite potential hole." It
forms the basis for many nuclear quantum mechanical calculations and is
one of the favored starting points for considerations leading to
shell structure. A solution of the potential-hole problem with finite
walls requires the use of Bessel functions inside, Hankel functions
outside the hole. The sequence of the energy values is unaltered, but the
levels are depressed.
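For the two lowest values of $l$ the roots $r_{ln}$ can be found with elementary functions, because $J_{1/2}(x)$ is proportional to $\sin x/\sqrt{x}$ and $J_{3/2}(x)$ to $(\sin x/x - \cos x)/\sqrt{x}$. The sketch below locates the first two roots for each by bisection; the bracketing intervals and the function names are assumptions chosen for the example, and energies are reported in units of $\hbar^2/2ma^2$.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Root of g in [lo, hi] by bisection; assumes g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Functions sharing their positive roots with J_{1/2} (l = 0) and J_{3/2} (l = 1).
radial = {0: math.sin,
          1: lambda x: math.sin(x) - x * math.cos(x)}

# Assumption: hand-picked intervals, each containing exactly one sign change.
brackets = {0: [(3.0, 3.5), (6.0, 6.5)],
            1: [(4.0, 5.0), (7.0, 8.0)]}

roots = {l: [bisect(radial[l], lo, hi) for lo, hi in brackets[l]] for l in (0, 1)}

for l in (0, 1):
    for n, r in enumerate(roots[l], start=1):
        # Energy E_ln in units of hbar^2 / (2 m a^2) is simply r_ln squared.
        print(l, n, r, r * r)
```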



c. No Enclosure. When the particle is allowed to exist anywhere in
space, the former boundary conditions need not be applied. The simplest
way to treat this case is to return to case (a) and permit $l_1$, $l_2$, and $l_3$ to
become infinite. Let us first consider the eigenvalues. The lattice of
points will condense as the $l$'s increase, until finally it forms a continuum;
the energy states (lengths of connecting lines squared) will also move closer
and closer together until finally all (positive) energies are permitted. A
similar effect may be brought about by increasing the mass of the particle,
as a glance at eq. (36) will show. Quantum mechanics indicates no
quantization of the energy for particles which are not restricted in their motion, or
which have an infinite mass.

What happens to the $\psi$-function, (35), as the $l$'s increase? Clearly, the
normalizing constant $c$ tends to zero, causing $\psi$ also to vanish. The
meaning of this is quite simple: As the space in which the mass point moves
increases indefinitely, the chance of finding it at a given point, $|\psi(x,y,z)|^2$,
approaches zero. The failure of the normalization rule is therefore not
merely a mathematical phenomenon, but physically reasonable. To
circumvent it, several procedures may be employed. One is to suppose that
there is an infinite number of particles in all space, $N$ per unit volume, and
accordingly to put $\int |\psi|^2 \, d\tau$, taken over a unit of volume, equal to $N$.
This leaves $c$ finite.

When there are no boundary conditions the $\psi$-function need not be
written as a product of sines. In fact in the absence of an enclosure sine,
cosine and exponential functions are equally acceptable. Hence we may,
if we desire, write

$$\psi_{\mathbf{k}} = c(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{r}}, \quad E = \frac{\hbar^2}{2m}k^2$$

using the notation explained in connection with eq. (38) of Chapter 7.

Problem. Calculate eigenfunctions and eigenvalues of a free particle enclosed in a cylinder of radius a and length d, obtaining

$$E = \frac{\hbar^2}{2m}\left(\frac{n^2\pi^2}{d^2} + \alpha^2\right)$$

where $\alpha a$ is a root of $J_m$.


11.10. One-Dimensional Barrier Problems. For a one-dimensional problem the Schrodinger equation is

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}\,[E - V(x)]\,\psi = 0$$




Let us take V to be the step function given by the solid line in Fig. 1, that is: $V = 0$ if $x < 0$, $V$ = constant if $x > 0$. The solutions for the two regions are easily written down:

$$\psi_l = A_l e^{ik_l x} + B_l e^{-ik_l x}, \qquad x < 0 \ \text{(left of 0)}$$

$$\psi_r = A_r e^{ik_r x} + B_r e^{-ik_r x}, \qquad x > 0 \ \text{(right of 0)}$$

with

$$k_l = \frac{\sqrt{2mE}}{\hbar} \quad \text{and} \quad k_r = \frac{\sqrt{2m(E - V)}}{\hbar}$$

[Fig. 11-1: the step potential, V = 0 for x < 0 and V = const. for x > 0]

But how are they to be joined? The differential equation tells us that $\psi''$ suffers a finite discontinuity as we pass across the discontinuity in V. The increase in $\psi'$ in crossing the origin will be

$$\lim_{\epsilon\to 0}\int_{-\epsilon}^{\epsilon} \psi''\, dx = \lim_{\epsilon\to 0}\, \epsilon\,(\psi_r'' + \psi_l'') = 0$$

Hence $\psi'$ (and a fortiori $\psi$) remains continuous at the origin. The constants A and B must therefore be fixed by requiring

$$\psi_l(0) = \psi_r(0); \qquad \psi_l'(0) = \psi_r'(0)$$

In addition to these two we have an equation expressing normalization, three relations in all. However, there are four constants $(A_l, A_r, B_l, B_r)$ to be determined. The mathematical situation is therefore such that one of them may be chosen at will. Let us then put $B_r$ equal to zero. The physical meaning of this will at once be clear.

On applying the continuity conditions we have

$$A_l + B_l = A_r; \qquad k_l(A_l - B_l) = k_r A_r$$





The coefficients A and B have a simple significance. Let us analyze from our fundamental point of view a state function of the form $\psi = Ae^{ikx} + Be^{-ikx}$. In view of the third postulate (eq. 14′) it represents a mean momentum

$$\bar{p} = -i\hbar\, \frac{\int \psi^*\psi'\, dx}{\int \psi^*\psi\, dx}$$

and a mean square momentum

$$\overline{p^2} = -\hbar^2\, \frac{\int \psi^*\psi''\, dx}{\int \psi^*\psi\, dx}$$

We have intentionally left the limits of integration indefinite. In evaluating the integrals occurring here we assume that the range of integration is very much larger than the wave length of the particles, $2\pi/k$. The integral over the last two terms of $\psi^*\psi = A^*A + B^*B + AB^*e^{2ikx} + A^*Be^{-2ikx}$ will then vanish, and

$$\int \psi^*\psi\, dx = (|A|^2 + |B|^2)\, l$$

l being the range of integration. By a similar procedure,

$$\int \psi^*\psi'\, dx = ik(|A|^2 - |B|^2)\, l \quad \text{and} \quad \int \psi^*\psi''\, dx = -k^2(|A|^2 + |B|^2)\, l$$

Hence

$$\bar{p} = k\hbar\, \frac{|A|^2 - |B|^2}{|A|^2 + |B|^2}, \quad \text{while} \quad \overline{p^2} = k^2\hbar^2$$

It will also be observed that $\psi$ is an eigenstate of the operator $-\hbar^2\, d^2/dx^2$ but not of $-i\hbar\, d/dx$.

dx 

Translated into particle language, this state of affairs must be expressed as follows. Since all particles have a root mean square momentum of magnitude $k\hbar$, and yet the mean momentum along x is smaller than $k\hbar$,
some of them must be traveling to the right, others to the left, with momen 





whence

$$\frac{\rho}{\alpha} = \frac{1 - \bar{p}/k\hbar}{1 + \bar{p}/k\hbar} = \frac{|B|^2}{|A|^2}$$

In our problem, $\rho/\alpha$ is the reflection coefficient of the barrier of potential energy V. In view of eq. (37) it is given by

$$R = \frac{|k_l - k_r|^2}{|k_l + k_r|^2}$$

Two cases of interest may be distinguished, (a) $E < V$, (b) $E > V$. In classical mechanics, a particle would certainly be reflected in case a ($R = 1$), certainly transmitted in case b ($R = 0$). The matter is not quite so simple in quantum mechanics. In case a, $k_l$ is real but $k_r$ is imaginary. R is thus always 1, in agreement with the classical prediction. But in case b both $k_l$ and $k_r$ are real, and $R < 1$ but not zero. Hence every potential barrier reflects particles, even though classically one would expect them to be only retarded.
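
The two cases can be checked numerically. The following is a minimal sketch, assuming units in which the mass and $\hbar$ default to 1; the function name and its parameters are illustrative and not part of the text. It evaluates $R = |k_l - k_r|^2/|k_l + k_r|^2$ with complex arithmetic, so that an imaginary $k_r$ is handled by the same formula.

```python
import cmath


def reflection_coefficient(energy, step_height, mass=1.0, hbar=1.0):
    """Return R = |k_l - k_r|^2 / |k_l + k_r|^2 for a potential step.

    cmath.sqrt makes k_r purely imaginary when energy < step_height,
    which is case (a) of the text.
    """
    k_left = cmath.sqrt(2.0 * mass * energy) / hbar
    k_right = cmath.sqrt(2.0 * mass * (energy - step_height)) / hbar
    return abs(k_left - k_right) ** 2 / abs(k_left + k_right) ** 2


if __name__ == "__main__":
    # Energies below and above a step of height 1 (arbitrary units).
    for energy in (0.5, 1.5, 2.0, 10.0):
        print(energy, reflection_coefficient(energy, step_height=1.0))
```

For energies below the step the function returns 1; above the step it returns a value between 0 and 1 that falls off as the energy grows.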

Before leaving this matter, we must justify the procedure of setting $B_r$ equal to zero. This is now seen to mean omission of a beam of particles travelling to the left in the region to the right of the origin. Had such a beam been included, the physical condition corresponding to $\psi$ would have implied the incidence of two beams of particles upon the origin, one from the left and one from the right. In that case, $\rho/\alpha$ is not the reflection coefficient of the barrier. The $\psi$-function we have chosen permits that interpretation, for it corresponds to one beam incident from the left, one reflected and one transmitted beam.


Problem. Prove that p is the same whether it is computed to the left or to the right 
of the origin [use conditions (37)]. 


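A study of more complicated barriers, such as that depicted in Fig. 2, reveals a new and striking feature: the “tunnel effect.” The energy E of the incident particles is assumed to be greater than $V_1$ and $V_3$, but smaller than $V_2$, so that from the classical point of view every particle would certainly be reflected. If we define

$$k_1^2 = \frac{2m}{\hbar^2}(E - V_1); \qquad \kappa^2 = -k_2^2 = \frac{2m}{\hbar^2}(V_2 - E); \qquad k_3^2 = \frac{2m}{\hbar^2}(E - V_3)$$

the $\psi$-functions for the three regions are

$$\psi_1 = A_1 e^{ik_1 x} + B_1 e^{-ik_1 x}, \qquad x < 0$$

$$\psi_2 = A_2 e^{\kappa x} + B_2 e^{-\kappa x}, \qquad 0 < x < a$$

$$\psi_3 = A_3 e^{ik_3 x}, \qquad x > a$$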




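The continuity conditions for $\psi$ and $\psi'$ at both x = 0 and x = a are seen to be:

$$A_1 + B_1 = A_2 + B_2$$

$$ik_1(A_1 - B_1) = \kappa(A_2 - B_2)$$

$$A_2 e^{\kappa a} + B_2 e^{-\kappa a} = A_3 e^{ik_3 a}$$

$$\kappa(A_2 e^{\kappa a} - B_2 e^{-\kappa a}) = ik_3 A_3 e^{ik_3 a}$$

[Fig. 11-2: potential barrier extending from x = 0 to x = a]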

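From these, $B_1$, $A_2$, and $B_2$ may be eliminated. When this is done we obtain the relation

$$A_1 = \tfrac{1}{2} A_3 e^{ik_3 a}\left\{\left(1 + \frac{k_3}{k_1}\right)\cosh \kappa a + i\left(\frac{\kappa}{k_1} - \frac{k_3}{\kappa}\right)\sinh \kappa a\right\} \tag{11-38}$$

An argument similar to that which led us to identify the reflection coefficient R with $|B|^2/|A|^2$, shows the transmission coefficient of the barrier to be

$$T = \frac{k_3}{k_1}\, \frac{|A_3|^2}{|A_1|^2}$$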




This may be computed from (38). In doing so we assume that $\kappa a \gg 1$ so that both $\cosh \kappa a$ and $\sinh \kappa a$ become $\tfrac{1}{2}e^{\kappa a}$. Then

$$T = \frac{16\, k_1 k_3\, e^{-2\kappa a}}{(k_1 + k_3)^2 + \left(\kappa - \dfrac{k_1 k_3}{\kappa}\right)^2}$$

As the width of the barrier increases, the factor $e^{-2\kappa a}$ (sometimes called the “transparency factor”) rapidly diminishes.
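
The quality of the thick-barrier approximation can be examined numerically. The sketch below is illustrative only: it assumes units with mass and $\hbar$ equal to 1 by default, takes the transmission coefficient as $(k_3/k_1)|A_3|^2/|A_1|^2$ (the form that the thick-barrier result implies), and uses function names that do not come from the text.

```python
import math


def wave_numbers(energy, v1, v2, v3, mass=1.0, hbar=1.0):
    """Return (k1, kappa, k3) for V1 < E, V3 < E and E < V2."""
    k1 = math.sqrt(2.0 * mass * (energy - v1)) / hbar
    kappa = math.sqrt(2.0 * mass * (v2 - energy)) / hbar
    k3 = math.sqrt(2.0 * mass * (energy - v3)) / hbar
    return k1, kappa, k3


def transmission_exact(energy, v1, v2, v3, width, mass=1.0, hbar=1.0):
    """Transmission coefficient from relation (38), without approximation."""
    k1, kappa, k3 = wave_numbers(energy, v1, v2, v3, mass, hbar)
    # ratio = A1 / (A3 * exp(i k3 a)), read off from (38)
    ratio = 0.5 * complex(
        (1.0 + k3 / k1) * math.cosh(kappa * width),
        (kappa / k1 - k3 / kappa) * math.sinh(kappa * width),
    )
    return (k3 / k1) / abs(ratio) ** 2


def transmission_thick(energy, v1, v2, v3, width, mass=1.0, hbar=1.0):
    """Thick-barrier form, valid when kappa * width is much larger than 1."""
    k1, kappa, k3 = wave_numbers(energy, v1, v2, v3, mass, hbar)
    denominator = (k1 + k3) ** 2 + (kappa - k1 * k3 / kappa) ** 2
    return 16.0 * k1 * k3 * math.exp(-2.0 * kappa * width) / denominator


if __name__ == "__main__":
    for width in (0.5, 1.0, 2.0, 5.0):
        print(width,
              transmission_exact(1.0, 0.0, 3.0, 0.5, width),
              transmission_thick(1.0, 0.0, 3.0, 0.5, width))
```

For small widths the two values differ noticeably; as $\kappa a$ grows they agree to many figures, and both fall off essentially as the transparency factor.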

The surprising fact is that particles are able to “ tunnel ” through the 
barrier although their kinetic energy is not great enough to allow them to 
pass it. Classically speaking, the kinetic energy of a particle would be 



negative while it is in region 2. Quantum mechanically, this statement is 
devoid of meaning, since it is improper to compute E — V for this region 
alone.¹¹

Fig. 3 gives a qualitative plot of the (real part of the) $\psi$-function in
the three regions here considered. It is seen that the barrier attenuates 
the wave coming from the left, permitting a fraction of its amplitude to 
pass out at a. The situation is quite analogous to the passage of a wave 
through an absorbing layer. 

11.11. Simple Harmonic Oscillator. The potential energy, usually expressed in the form $\tfrac{1}{2}kx^2$, is $\tfrac{1}{2}m\omega^2x^2$ when written in terms of the mass m and the classical frequency $\omega = 2\pi\nu$ of the oscillator. The meaning of $\omega$ is that, classically, one would expect the oscillator to go back and forth $\omega/2\pi$ times per second. The Schrodinger equation is

$$\frac{d^2\psi}{dx^2} + (\epsilon - \beta^2x^2)\,\psi = 0 \tag{11-39}$$

if we use the abbreviations

$$\epsilon = \frac{2mE}{\hbar^2}, \qquad \beta = \frac{m\omega}{\hbar}$$

The substitution $\xi = \sqrt{\beta}\, x$ reduces (39) to the form of the differential equation for “Hermite’s orthogonal functions,”

$$\frac{d^2\psi}{d\xi^2} + \left(\frac{\epsilon}{\beta} - \xi^2\right)\psi = 0$$

which was studied in Chapter 2 (cf. eq. 2-66). It was there found that its solution is of the form $e^{-\xi^2/2}H(\xi)$, $H(\xi)$ being a solution of Hermite’s equation (2-62). Now $H(\xi)$ is a polynomial if the quantity $\alpha$, which corresponds to the present $\tfrac{1}{2}(\epsilon/\beta - 1)$, is an integer. Unless this is true, H is a superposition of the infinite sequences (2-63) and (2-64). But both of these approach infinity like $e^{\xi^2}$, as closer inspection will show. If they are multiplied by $e^{-\xi^2/2}$, they will not yield a $\psi$-function which has an integrable square between the limits $-\infty$ and $+\infty$, which we are here assuming to exist. Hence $H(\xi)$ must be chosen in its polynomial form, $H_n(\xi)$. Also, $\tfrac{1}{2}(\epsilon/\beta - 1) = n$, and this leads to

$$E_n = (n + \tfrac{1}{2})\hbar\omega = (n + \tfrac{1}{2})h\nu \tag{11-40}$$

$$\psi_n = c\, e^{-(\beta/2)x^2} H_n(\sqrt{\beta}\, x) \tag{11-41}$$

If the oscillator has three degrees of freedom, the Schrodinger equation is

$$\nabla^2\psi + (\epsilon - \beta^2r^2)\,\psi = 0$$

when the same abbreviations as above are used. The method of separation of variables (Chapter 7) which involves the substitution of $X(x) \cdot Y(y) \cdot Z(z)$ for $\psi$ at once reduces this partial differential equation to three ordinary ones

$$X'' + (\epsilon_1 - \beta^2x^2)X = 0, \qquad Y'' + (\epsilon_2 - \beta^2y^2)Y = 0, \qquad Z'' + (\epsilon_3 - \beta^2z^2)Z = 0$$

provided that $\epsilon_1 + \epsilon_2 + \epsilon_3 = \epsilon$. Each of these has a solution of the form (41), so that

$$\psi_{n_1n_2n_3} = c\, e^{-(\beta/2)r^2}\, H_{n_1}(\sqrt{\beta}\, x) \cdot H_{n_2}(\sqrt{\beta}\, y) \cdot H_{n_3}(\sqrt{\beta}\, z) \tag{11-42}$$




and

$$E_{n_1n_2n_3} = (n_1 + n_2 + n_3 + \tfrac{3}{2})\,\hbar\omega \tag{11-43}$$


The orthogonality of the functions (41) has been proved in oq. 3 1)2. 
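From this formula, the normalizing constant c may also be computed. For if

$$\int_{-\infty}^{\infty} c^2 e^{-\beta x^2} H_n^2(\sqrt{\beta}\, x)\, dx = c^2\beta^{-1/2}\int_{-\infty}^{\infty} e^{-\xi^2} H_n^2(\xi)\, d\xi = c^2 \cdot 2^n n! \left(\frac{\pi}{\beta}\right)^{1/2} = 1$$

then

$$c = \left(\frac{\beta}{\pi}\right)^{1/4} (n!\, 2^n)^{-1/2}$$

A similar computation, which involves three integrations, yields for the constant c of eq. (42) the value

$$c = \left(\frac{\beta}{\pi}\right)^{3/4} (n_1!\, n_2!\, n_3!\, 2^{n_1+n_2+n_3})^{-1/2}$$

Further mathematical details concerning the functions here encountered, as well as a table of the $H_n$-polynomials, are given in sec. 3.10.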
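
The normalizing constant and the orthogonality of the functions (41) can be verified numerically. The sketch below is an illustration under stated assumptions: the Hermite polynomials are generated by the standard recursion $H_{k+1} = 2\xi H_k - 2kH_{k-1}$, the integral is approximated by the trapezoidal rule on a finite interval, and the function names, the interval half-width and the step count are arbitrary choices that do not come from the text.

```python
import math


def hermite(n, xi):
    """H_n(xi) from the recursion H_{k+1} = 2 xi H_k - 2 k H_{k-1}."""
    if n == 0:
        return 1.0
    previous, current = 1.0, 2.0 * xi
    for k in range(1, n):
        previous, current = current, 2.0 * xi * current - 2.0 * k * previous
    return current


def psi(n, x, beta=1.0):
    """Normalized oscillator function of eq. (41) with c = (beta/pi)^(1/4) (n! 2^n)^(-1/2)."""
    c = (beta / math.pi) ** 0.25 / math.sqrt(math.factorial(n) * 2 ** n)
    return c * math.exp(-0.5 * beta * x * x) * hermite(n, math.sqrt(beta) * x)


def overlap(n, m, beta=1.0, half_width=12.0, steps=4000):
    """Trapezoidal estimate of the integral of psi_n * psi_m over the real line."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * psi(n, x, beta) * psi(m, x, beta)
    return total * h


if __name__ == "__main__":
    print(overlap(3, 3), overlap(2, 4), overlap(1, 1, beta=2.5))
```

Diagonal overlaps come out equal to 1 and off-diagonal overlaps vanish, to within the accuracy of the quadrature.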


Problem. The treatment above implied that the three-dimensional oscillator was isotropic; i.e., bound with equal forces in all directions. Calculate eigenvalues and eigenfunctions for an anisotropic oscillator with potential energy

$$V = \tfrac{1}{2}m(\omega_1^2x^2 + \omega_2^2y^2 + \omega_3^2z^2)$$

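11.12. Rigid Rotator, Eigenvalues and Eigenfunctions of $L^2$. A rigid rotator is a pair of point masses held together by a rigid, inflexible and inextensible (massless) bond. A diatomic molecule is a fair approximation to a rigid rotator. Before attempting to solve the Schrodinger equation for such a system it is well to digress briefly and consider the eigenvalue equation for an operator which so far we have not introduced, but which is easily constructed. We have seen that the operators corresponding to the components of angular momentum of a particle are

$$L_x = -i\hbar\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right), \quad L_y = -i\hbar\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right), \quad L_z = -i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) \tag{11-44}$$

The operator in question is

$$L^2 = L_x^2 + L_y^2 + L_z^2 \tag{11-45}$$

It is advantageous to do this in polar (spherical) coordinates.* Putting $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, $z = r\cos\theta$, we have

$$\frac{\partial}{\partial x} = \sin\theta\cos\varphi\,\frac{\partial}{\partial r} + \frac{1}{r}\cos\theta\cos\varphi\,\frac{\partial}{\partial\theta} - \frac{1}{r}\frac{\sin\varphi}{\sin\theta}\frac{\partial}{\partial\varphi}$$

$$\frac{\partial}{\partial y} = \sin\theta\sin\varphi\,\frac{\partial}{\partial r} + \frac{1}{r}\cos\theta\sin\varphi\,\frac{\partial}{\partial\theta} + \frac{1}{r}\frac{\cos\varphi}{\sin\theta}\frac{\partial}{\partial\varphi}$$

$$\frac{\partial}{\partial z} = \cos\theta\,\frac{\partial}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial}{\partial\theta}$$

When these results are introduced in (44) and (45) is formed, there results

$$L^2 = -\hbar^2\left\{\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right\} \tag{11-46}$$

The observable values which the square of the angular momentum may assume are the eigenvalues p of the equation

$$L^2\psi = p\,\psi \tag{11-47}$$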


This equation is easily solved by the method of separation of variables (cf. Chapter 7). Clearly, $\psi$ is a function of $\theta$ and $\varphi$. Put $\psi = \Theta(\theta) \cdot \Phi(\varphi)$ into (47). This equation will then break up into two ordinary equations (the process is analogous to the construction of eqs. 7-42a and 7-42b):

$$\frac{1}{\sin\theta}\frac{d}{d\theta}(\sin\theta\,\Theta') + \left(\frac{p}{\hbar^2} - \frac{m^2}{\sin^2\theta}\right)\Theta = 0$$

$$\Phi'' = -m^2\Phi$$

The quantity m must be an integer to insure single-valuedness of $\Phi$. The second equation therefore has the solution $\Phi = \text{const.}\ e^{im\varphi}$, m an integer. The first is the equation for associated Legendre functions (eq. 7-42b), except that the constant $l(l + 1)$ appearing there is here replaced by $p/\hbar^2$. The solution previously obtained is

$$\Theta = \sin^m\theta\, \frac{d^m}{d(\cos\theta)^m}\, P_l(\cos\theta)$$

Now the Legendre function $P_l$ was shown to behave singularly at $\cos\theta = \pm 1$ unless l is an integer, in fact it would contain unlimited powers of $x\ (= \cos\theta)$. The same would be true for $\Theta$ if l were arbitrary. But in

* See also the problem at the end of this section.




that case

$$\int \psi^*\psi\, d\tau$$

which contains the factor

$$\int_0^{\pi} \Theta^2 \sin\theta\, d\theta$$

would certainly not exist. We conclude, therefore, that l must be an integer, and that the eigenvalues of $L^2$ are

$$p = l(l + 1)\hbar^2$$

On the other hand, the eigenfunctions of $L^2$ are of the form

$$\sin^m\theta\, \frac{d^m}{d(\cos\theta)^m}\, P_l(\cos\theta)\, e^{im\varphi} = P_l^m(\cos\theta)\, e^{im\varphi} \tag{11-48}$$

in the notation adopted in Chapter 3 (cf. eq. 3-43). Since the eigenvalue p does not depend on m but only on l, functions like (48) with different m will satisfy eq. (47). The most general solution of that equation is therefore,¹²

$$\psi = \sum_{m=-l}^{l} c_m P_l^m(\cos\theta)\, e^{im\varphi} \tag{11-49}$$

In Chapter 7 this function has already been encountered; it is called a spherical harmonic and denoted by $Y_l(\theta,\varphi)$ (cf. eq. 7-43 et seq.). Hence

$$\psi = Y_l(\theta,\varphi) \tag{11-50}$$

Since $d\tau = \sin\theta\, d\theta\, d\varphi$, normalization requires that

$$\int_0^{\pi} \sin\theta\, d\theta \int_0^{2\pi} d\varphi\, \psi^*\psi = 1$$

When (49) is inserted the integral becomes

$$2\pi\sum_m |c_m|^2 \int_{-1}^{1} [P_l^m(x)]^2\, dx = \frac{4\pi}{2l + 1}\sum_m \frac{(l + m)!}{(l - m)!}\, |c_m|^2$$

(cf. eq. 3-62). Hence, for normalization, the constants $c_m$ appearing in (49) must satisfy the relation

$$\sum_{m=-l}^{l} |c_m|^2\, \frac{(l + m)!}{(l - m)!} = \frac{2l + 1}{4\pi}$$

and are otherwise arbitrary.

We are now ready to return to the problem of the rigid rotator. In the first place, we shall assume it proper to replace it by a single mass, M, at a fixed distance a from a center of rotation and having the same moment of inertia




as the original system. The condition upon the state function in accord with this assumption, aside from single-valuedness, is simply r = a, a constant. The best procedure is therefore to write down the Schrodinger equation for a particle moving in three dimensions, and then put r = a, $\partial\psi/\partial r = 0$. This requires the use of polar (spherical) coordinates. The potential energy V, in this case, is clearly constant and may be taken to be zero.

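Schrodinger’s equation reads* (cf. Chapter 5 for transformation of $\nabla^2$)

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{2M}{\hbar^2}E\psi = 0 \tag{11-51}$$

When r is put equal to a the first term on the left vanishes, and the remainder becomes very similar to $L^2\psi$. Indeed if we introduce a new operator $\Lambda^2$ defined as $(1/\hbar^2)L^2$, eq. (51) may be written

$$\Lambda^2\psi = \frac{2Ma^2}{\hbar^2}E\psi \tag{11-52}$$

But the eigenvalues of $\Lambda^2$ are obviously $l(l + 1)$, and its eigenfunctions are the same as those of $L^2$. The constant $(2Ma^2/\hbar^2)E$ must be identified with $l(l + 1)$. Hence the eigenvalues and eigenfunctions are

$$E = \frac{\hbar^2}{2Ma^2}\, l(l + 1), \qquad \psi = Y_l(\theta,\varphi) \tag{11-53}$$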


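Problem. Show by vector algebra that

$$\Lambda^2 \equiv -(\mathbf{r} \times \nabla)^2 = -r^2\nabla^2 + 2r\frac{\partial}{\partial r} + r^2\frac{\partial^2}{\partial r^2}$$

Hint: Note that $(\mathbf{r} \times \nabla)^2 = \mathbf{r} \cdot [\nabla \times (\mathbf{r} \times \nabla)]$. Then use (4-26) for $\nabla \times (\mathbf{U} \times \mathbf{V})$.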

11.13. Motion in a Central Field. By central field is meant a field of force in which the potential energy is a function of r only; V is independent of $\theta$ and $\varphi$. The isotropic three-dimensional oscillator treated in sec. 11.11 is an example of motion in a central field. Another is the motion of a particle in a Coulomb field. It is to this last example, an electron attracted by a positive point charge (hydrogen atom), that we shall chiefly direct our attention. But before considering this specific case a few general features of the central field problem will be exposed.

It is now clear that the Laplacian, $\nabla^2$, in spherical polar coordinates has the form

$$\nabla^2 = \frac{1}{r^2}\left[\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \Lambda^2\right] \tag{11-54}$$



where $\Lambda^2$ is given by (46) divided by $\hbar^2$. The eigenvalues of $\Lambda^2$ are $l(l + 1)$. The Schrodinger equation therefore reads

$$\frac{1}{r^2}\left\{\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \Lambda^2\right\}\psi + \frac{2m}{\hbar^2}[E - V(r)]\,\psi = 0 \tag{11-55}$$

We write $\psi$ as a product of a function $R(r)$ and another, $A(\theta,\varphi)$, which depends only on the angles. The operator $\Lambda^2$ acts only on A. Eq. (55), after multiplication by $r^2$ and subsequent division by $R \cdot A$, has the form

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2mr^2}{\hbar^2}[E - V(r)] = \frac{\Lambda^2 A}{A} \tag{11-56}$$

The left-hand side of this equation is a function of r alone, the right a function of $\theta$ and $\varphi$. By the argument which is familiar from Chapter 7, each side must be a constant, say $\alpha$. Thus

$$\Lambda^2 A = \alpha A$$

But this is simply the eigenvalue equation for $\Lambda^2$. We see, then, that $\alpha = l(l + 1)$, and $A = Y_l(\theta,\varphi)$.

Eq. (56) therefore becomes

$$\frac{d}{dr}\left[r^2\frac{dR}{dr}\right] + \left\{\frac{2mr^2}{\hbar^2}[E - V(r)] - l(l + 1)\right\}R = 0 \tag{11-57a}$$

and the substitution $U(r) = rR(r)$ reduces this to

$$U'' + \frac{2m}{\hbar^2}\left[E - V(r) - \frac{l(l + 1)\hbar^2}{2mr^2}\right]U = 0 \tag{11-57b}$$


The development so far has been totally independent of the form of V, except in assuming it to be a function of r alone. The results obtained are therefore valid for any central field. Summarizing them, we may say: The energy states of a particle in a central field are always of the form

$$\psi = \frac{1}{r}\, U_l(r)\, Y_l(\theta,\varphi)$$

and the function $U_l$ is determined by eq. (57b). It was necessary to add a subscript l to U because the differential equation contains l as a parameter. The energies E are obtained solely from eq. (57b).

That equation looks very much like the one-dimensional Schrodinger equation,

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}[E - V(x)]\,\psi = 0 \tag{11-58}$$



but with the term $l(l + 1)\hbar^2/2mr^2$ added to the normal potential energy. What is the meaning of that term? In classical mechanics, the energy of a particle moving in three dimensions differs from that of a one-dimensional particle by the kinetic energy of rotation, $\tfrac{1}{2}mr^2\omega^2$. This is precisely the quantity $l(l + 1)\hbar^2/2mr^2$, for we have seen that $l(l + 1)\hbar^2$ is the certain value of the square of the angular momentum for the state $Y_l$, in classical language $(mr^2\omega)^2$, which when divided by $2mr^2$, gives exactly the kinetic energy of rotation.

There is, however, one further difference between (57b) and (58). The fundamental range of r in (57b) starts at r = 0 and is limited to positive values, whereas the range of x in (58) may include negative values. This fact often has a more important effect on the eigenvalues than the addition of the terms just mentioned.

Let us now solve eq. (57b), assuming a Coulomb field, e.g., $V(r) = -e^2/r$. The energies E will then be the energy levels of the hydrogen atom.¹³ For sufficiently large r the solution is determined by

$$U'' - \frac{\alpha^2}{4}\, U = 0 \tag{11-59}$$

provided we define

$$\alpha^2 = -\frac{8mE}{\hbar^2} \tag{11-60}$$

The solution of (59) is $U_\infty = c_1 e^{(\alpha/2)r} + c_2 e^{-(\alpha/2)r}$, and this represents the behavior of the correct U at $\infty$. Let us first suppose that $\alpha$ is real, which means that the energy of the particle is negative. U will then certainly not have an integrable square (note that the radial integral has the form $\int R^2r^2\, dr = \int U^2(r)\, dr$) if the coefficient $c_1$ fails to vanish. But we cannot simply put it equal to zero because we have boundary conditions to fulfill! Without going further in our analysis at the moment we expect, therefore, that only special values of $\alpha$ will produce acceptable solutions when $\alpha$ is real. If the total energy of the particle is negative (classically speaking, the particle is bound to the attracting center), the energy is expected to be quantized. The following analysis will bear this out.

If $\alpha$ is imaginary, which means that E is positive, $U_\infty$ shows sinusoidal behavior. It has, in fact, the typical form of the state function for a free particle, and the failure of normalization occurs in the milder manner which we have previously found associated with the presence of a continuous spectrum of eigenvalues. There is indeed no way of choosing $c_1$ or $c_2$




or $\alpha$ which would make one $U_\infty$ more acceptable than another. We conclude that, when E is positive, the energy spectrum is continuous.

From the point of view of classical physics this result is welcome, for when E is positive the particle is ionised and moves through space, its energy being unrestricted.

We now discuss the bound states in a more rigorous manner. Put $E = -W$, so that W is positive. Our interest will now return to eq. (57a) which forms a more suitable basis for the present discussion. Let $r = x/\alpha$, where $\alpha$ is defined by (60). Eq. (57a) then reads, after some cancellation,

$$x\frac{d^2R}{dx^2} + 2\frac{dR}{dx} + \left[\frac{2me^2}{\hbar^2\alpha} - \frac{x}{4} - \frac{l(l + 1)}{x}\right]R = 0 \tag{11-61}$$

But this is precisely the differential equation for associated Laguerre functions, which was studied in Chapter 2 (cf. eq. 71). For our immediate purpose we shall write that equation with n* in place of n, since otherwise our notation would be in conflict with physical convention. To summarize the results of sec. 2.16:

The equation

$$xy'' + 2y' + \left[n^* - \frac{k - 1}{2} - \frac{x}{4} - \frac{k^2 - 1}{4x}\right]y = 0 \tag{11-62}$$

has a solution possessing an integrable square¹⁴ of the form

$$y = e^{-x/2}\, x^{(k-1)/2}\, L_{n^*}^{k}(x) \tag{11-63}$$

provided n* and k are positive integers. Moreover, $n^* - k \ge 0$ since otherwise $L_{n^*}^{k}$ would vanish.

On comparing (61) and (62) we find, in the first place, that $(k^2 - 1)/4 = l(l + 1)$, hence

$$k = 2l + 1$$

Secondly,

$$n^* - \frac{k - 1}{2} = n^* - l = \frac{2me^2}{\hbar^2\alpha}$$

When the value of $\alpha$ is inserted here and the relation is solved for W, we find

$$W = \frac{me^4}{2(n^* - l)^2\hbar^2}$$

Because of the conditions on n* and k, the quantity $n^* - l$ cannot be zero. It is usually denoted by n and called the total quantum number (cf. the role it played in the Bohr theory). Our conclusion then is this:




The energy states of the hydrogen atom are

$$W_n = -E_n = \frac{me^4}{2n^2\hbar^2} \tag{11-64}$$

The corresponding eigenfunctions are, in accordance with (63),

$$R_{n,l} = c_{n,l}\, e^{-x/2}\, x^l\, L_{n+l}^{2l+1}(x) \tag{11-65}$$

the variable x being defined by

$$x = \alpha r = \frac{2me^2}{n\hbar^2}\, r$$

In the Bohr theory of hydrogen, the first orbit has a radius

$$a_0 = \frac{\hbar^2}{me^2} = 0.53 \times 10^{-8}\ \text{cm.}$$

It is sometimes convenient to express x in terms of it. Thus $\alpha = 2/na_0$,

$$x = \frac{2}{n}\frac{r}{a_0} \tag{11-66}$$

It is to be noticed that x represents a different variable for each energy state. The quantum number n determining W appears as a scale factor in the dimensionless variable x.

Some integrals involving $R_{n,l}$, which occur frequently in physical and chemical problems, have been evaluated in sec. 3.11. See also the example at the end of sec. 3.11, which is of interest in this connection.

For later use, we write down in explicit form the state function for the ground state of the hydrogen atom. It is

$$R_{1,0} = c_{1,0}\, e^{-r/a_0} L_1^1 = 2a_0^{-3/2} e^{-r/a_0}$$

For this state $Y_l$ = constant = $(4\pi)^{-1/2}$ when the function is normalized. Hence the total ground state function is

$$\psi_0 = (\pi a_0^3)^{-1/2}\, e^{-r/a_0} \tag{11-67}$$

Functions for the higher states are listed in explicit form in Pauling and Wilson.¹⁵

When the charge on the nucleus is not e but Ze, $a_0$ must be replaced by $a_0/Z$, so that

$$\psi_0 = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2} e^{-Zr/a_0} \tag{11-67a}$$
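
These results lend themselves to a quick numerical check. The following sketch is illustrative and rests on assumptions that are not in the text: the Gaussian-unit values of $\hbar$, m and e are approximate modern values, the radial integral is cut off at a finite radius, and all function names are hypothetical. It evaluates $a_0 = \hbar^2/me^2$, the levels (64) in units of $me^4/\hbar^2$, and the normalization of the ground state function (67a).

```python
import math

# Approximate Gaussian-unit constants (assumed values, not taken from the text).
HBAR_ERG_S = 1.054572e-27
ELECTRON_MASS_G = 9.109384e-28
ELECTRON_CHARGE_ESU = 4.803204e-10


def bohr_radius_cm():
    """a_0 = hbar^2 / (m e^2), in centimetres."""
    return HBAR_ERG_S ** 2 / (ELECTRON_MASS_G * ELECTRON_CHARGE_ESU ** 2)


def energy_level(n):
    """E_n of eq. (64), measured in units of m e^4 / hbar^2."""
    return -1.0 / (2.0 * n * n)


def psi_ground(r, a0=1.0, z=1.0):
    """Ground state function of eq. (67a); z = 1 gives eq. (67)."""
    return math.sqrt(z ** 3 / (math.pi * a0 ** 3)) * math.exp(-z * r / a0)


def ground_state_norm(a0=1.0, z=1.0, r_max=40.0, steps=20000):
    """Trapezoidal estimate of the integral of |psi_0|^2 over all space."""
    h = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * 4.0 * math.pi * r * r * psi_ground(r, a0, z) ** 2
    return total * h


if __name__ == "__main__":
    print(bohr_radius_cm())
    print([energy_level(n) for n in (1, 2, 3)])
    print(ground_state_norm(), ground_state_norm(z=2.0))
```

The computed radius is close to the value quoted above, and the norm integral is close to 1 for either value of Z.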






Problem a. Using the results of Chapter 3, show that the normalizing factor in (65) is

$$c_{n,l} = \left(\frac{2}{na_0}\right)^{3/2}\left[\frac{(n - l - 1)!}{2n\,[(n + l)!]^3}\right]^{1/2}$$

Problem b. Work out the problem of the isotropic oscillator using spherical co- 
ordinates, and show that the results agree with those obtained in (42) and (43). 

11.14. Symmetrical Top. In dealing with the problem of a rotating rigid body attention must be given to the kinetic energy operator. To obtain it we first observe that its form in rectangular coordinates, for the n particle problem (cf. sec. 11.31) is

$$T = -\frac{\hbar^2}{2}\sum_{i=1}^{n}\frac{1}{m_i}\nabla_i^2$$

The position of a rigid body is best expressed in terms of the Eulerian angles, introduced in sec. 9.5. It was there shown that the classical kinetic energy is given by

$$T_c = \tfrac{1}{2}\sum_{i=1}^{n} m_i(\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2) = \tfrac{1}{2}A\dot{\beta}^2 + \tfrac{1}{2}A\dot{\alpha}^2\sin^2\beta + \tfrac{1}{2}C(\dot{\gamma} + \dot{\alpha}\cos\beta)^2$$

Let us define a line element constructed from the Cartesian coordinates

$$\xi_i = \sqrt{m_i}\, x_i, \qquad \eta_i = \sqrt{m_i}\, y_i, \qquad \zeta_i = \sqrt{m_i}\, z_i$$

as follows:

$$ds^2 = \sum_{i=1}^{n}(d\xi_i^2 + d\eta_i^2 + d\zeta_i^2) \tag{11-68}$$

This is clearly identical with $2T_c\, dt^2$. From the form of $T_c$ in Eulerian coordinates it is seen that $ds^2$ in these coordinates is given by

$$ds^2 = A\, d\beta^2 + A\sin^2\beta\, d\alpha^2 + C(d\gamma + \cos\beta\, d\alpha)^2 \tag{11-69}$$

Now the quantum mechanical form of T is the Laplacian operator corresponding to the line element $ds^2$, multiplied by $-\hbar^2/2$. The problem is therefore to transform the Laplacian operator from a set of coordinates in terms of which the line element is given by (68), to a new set in terms of which the line element is (69).

This problem has been discussed in sec. 5.17. If

$$ds^2 = \sum_{\lambda\mu} g_{\lambda\mu}\, dq_\lambda\, dq_\mu$$




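Suppose that the matrices P and Q, which satisfy (71a) and make H diagonal, have already been found. Then

$$H_{kl} = \frac{1}{2m}(P^2 + m^2\omega^2Q^2)_{kl} = E_k\delta_{kl} \tag{11-74}$$

provided we write $E_k$ for the diagonal elements of H.

Now let $A = P - im\omega Q$ and $B = P + im\omega Q$. Then, because of eq. (71a),

$$AB = 2mH + m\omega\hbar 1 \tag{11-75}$$

and

$$BA = 2mH - m\omega\hbar 1 \tag{11-76}$$

Now form ABA from (75) and (76):

$$A(2mH - m\omega\hbar 1) = (2mH + m\omega\hbar 1)A$$

$$\sum_\lambda A_{k\lambda}(E_\lambda\delta_{\lambda j} - \tfrac{1}{2}\omega\hbar\,\delta_{\lambda j}) = \sum_\lambda (E_k\delta_{k\lambda} + \tfrac{1}{2}\omega\hbar\,\delta_{k\lambda})A_{\lambda j}$$

$$A_{kj}(E_j - E_k - \omega\hbar) = 0$$

Hence $A_{kj}$ vanishes unless

$$E_j - E_k = \omega\hbar$$

Next, form BAB from (75) and (76):

$$B(2mH + m\omega\hbar 1) = (2mH - m\omega\hbar 1)B$$

$$\sum_\lambda B_{k\lambda}(E_\lambda\delta_{\lambda j} + \tfrac{1}{2}\omega\hbar\,\delta_{\lambda j}) = \sum_\lambda (E_k\delta_{k\lambda} - \tfrac{1}{2}\omega\hbar\,\delta_{k\lambda})B_{\lambda j}$$

$B_{kj}(E_j - E_k + \omega\hbar) = 0$ or, by changing the subscripts,

$$B_{jk}(E_k - E_j + \omega\hbar) = 0$$

Hence $B_{jk}$ vanishes unless

$$E_j - E_k = \omega\hbar$$

Now take a diagonal element of eq. (76):

$$(BA)_{jj} = 2m(E_j - \tfrac{1}{2}\hbar\omega) \tag{11-77}$$

But

$$(BA)_{jj} = \sum_k B_{jk}A_{kj}$$

Each term in the summation over k vanishes except the one for which $E_j - E_k = \omega\hbar$.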


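Or there is no eigenvalue which is less than $E_j$ by $\hbar\omega$. Then the right side of (77) is zero and

$$E_j = \tfrac{1}{2}\hbar\omega$$

This must be the lowest eigenvalue. From this analysis we may conclude that the sequence of eigenvalues is

$$\tfrac{1}{2}\hbar\omega, \quad \tfrac{3}{2}\hbar\omega, \quad \tfrac{5}{2}\hbar\omega, \quad \text{etc.}$$

in agreement with the results of sec. 11.11.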

11.17. Equivalence of Operator and Matrix Methods. We first establish a theorem of great importance in quantum mechanics. Consider a differential, Hermitian operator L of the kind discussed in sec. 4, which generates, through the eigenvalue equation

$$L\phi_i = l_i\phi_i$$

a complete set of orthonormal functions $\phi_i$. Whether $\phi_i$ is a function of one or many coordinates is unimportant in this connection. If we introduce other operators M, N which act on the same variables as L we can clearly form two square arrays of numbers, i.e., matrices, by the rule:

$$M_{ij} = \int \phi_i^* M\phi_j\, d\tau, \qquad N_{ij} = \int \phi_i^* N\phi_j\, d\tau \tag{11-78}$$

$d\tau$ being the element of configuration space of the variables of $\phi_i$. The theorem asserts that equations which hold between the operators M and N also hold between the matrices formed by the rule (78). To prove this it is necessary only to establish this parallelism for the two fundamental operations, addition and multiplication:

$$(M + N)_{ij} = M_{ij} + N_{ij} \tag{11-79}$$

$$(MN)_{ij} = \sum_\lambda M_{i\lambda}N_{\lambda j} \tag{11-80}$$

The first of these is at once evident from (78). To prove the second, let us expand the function $N\phi_j$ in terms of the $\phi_i$ themselves:

$$N\phi_j = \sum_\lambda a_{\lambda j}\phi_\lambda \tag{11-81}$$

By the general procedure of finding the expansion coefficients,¹⁶

$$a_{\lambda j} = \int \phi_\lambda^* N\phi_j\, d\tau = N_{\lambda j} \tag{11-82}$$

The left side of (80) is, by definition,

$$\int \phi_i^* MN\phi_j\, d\tau$$

On using (81) and (82), this becomes

$$\sum_\lambda a_{\lambda j}\int \phi_i^* M\phi_\lambda\, d\tau = \sum_\lambda M_{i\lambda}N_{\lambda j}$$

in accord with eq. (80).

16 Multiply the equation by $\phi_\lambda^*$ and integrate, using the orthogonality of the $\phi_i$.

If, then, we wish to form matrices satisfying relations like (71) or (71a), we need only find operators which conform to them, select an orthonormal set of functions $\phi$ and construct the matrices by means of the rule (78).

Problem a. The operators $Q = e^{ix}$ and $P = -\hbar e^{-ix}(d/dx)$ satisfy

$$PQ - QP = -i\hbar 1$$

Use the functions $\phi_k = \dfrac{1}{\sqrt{2\pi}}\, e^{ikx}$, $k = 0, \pm 1, \pm 2, \cdots$ to construct the matrices $P_{kl}$, $Q_{kl}$. They will be found to be identical with those given in eq. (73).
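
The rule (78) and the product rule (80) can be tried out on these operators numerically. The sketch below is illustrative and makes several assumptions of its own: $\hbar$ defaults to 1, $Q = e^{ix}$ is the form inferred from the commutation relation, the matrix is truncated to $|k| \le 4$, and all function names are hypothetical. The integral over one period is replaced by an equally spaced sum, which is exact for the low-order exponentials that occur here.

```python
import cmath
import math


def phi(k, x):
    """Basis function exp(ikx) / sqrt(2 pi)."""
    return cmath.exp(1j * k * x) / math.sqrt(2.0 * math.pi)


def apply_q(l, x):
    """(Q phi_l)(x) with Q = exp(ix)."""
    return cmath.exp(1j * x) * phi(l, x)


def apply_p(l, x, hbar=1.0):
    """(P phi_l)(x) with P = -hbar exp(-ix) d/dx; d/dx phi_l = i l phi_l."""
    return -hbar * cmath.exp(-1j * x) * (1j * l) * phi(l, x)


def matrix_element(apply_op, k, l, points=64):
    """Rule (78): integral over 0..2 pi of conj(phi_k) * (Op phi_l)."""
    step = 2.0 * math.pi / points
    return step * sum(phi(k, n * step).conjugate() * apply_op(l, n * step)
                      for n in range(points))


def build_matrix(apply_op, k_max):
    indices = range(-k_max, k_max + 1)
    return [[matrix_element(apply_op, k, l) for l in indices] for k in indices]


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(k_max=4):
    """Matrix of PQ - QP, formed by the product rule (80)."""
    p = build_matrix(apply_p, k_max)
    q = build_matrix(apply_q, k_max)
    pq, qp = matmul(p, q), matmul(q, p)
    size = len(p)
    return [[pq[i][j] - qp[i][j] for j in range(size)] for i in range(size)]


if __name__ == "__main__":
    print([commutator()[i][i] for i in range(1, 8)])
```

Away from the first and last rows, where the truncation of the infinite matrices intrudes, the diagonal elements of the commutator equal $-i\hbar$.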
Problem b. Construct the matrices $X_{nm}$ and $P_{nm}$, using $X = x$, $P = -i\hbar(d/dx)$, and taking as the orthonormal set the normalized Hermite orthogonal functions discussed in Chapter 3. Note that n and m can only be 0 or positive.

Ans. $X_{nm} = \sqrt{(n + 1)/2\beta}\ \delta_{m,n+1} + \sqrt{n/2\beta}\ \delta_{m,n-1}$;

$P_{nm} = i\hbar\beta(n - m)X_{nm}$. [$\beta$ is defined after eq. (39).]

Show that these matrices satisfy (71a).
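
A finite section of these matrices already shows the structure found in sec. 11.16. The sketch below is a minimal illustration, assuming $\hbar = m = \omega = 1$ by default and an 8 by 8 truncation; the function names are hypothetical. It builds X and P from the answer above, forms $PX - XP$ and $H = (P^2 + m^2\omega^2X^2)/2m$, and leaves aside the last row and column, which are falsified by the truncation.

```python
import math


def oscillator_matrices(size, beta=1.0, hbar=1.0):
    """Truncated X and P in the Hermite-function basis (answer to Problem b)."""
    x = [[0j] * size for _ in range(size)]
    p = [[0j] * size for _ in range(size)]
    for n in range(size):
        for m in (n - 1, n + 1):
            if 0 <= m < size:
                # X_{n,n+1} = sqrt((n+1)/2 beta), X_{n,n-1} = sqrt(n/2 beta)
                x[n][m] = complex(math.sqrt(max(n, m) / (2.0 * beta)))
                p[n][m] = 1j * hbar * beta * (n - m) * x[n][m]
    return x, p


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator_and_hamiltonian(size, mass=1.0, omega=1.0, hbar=1.0):
    """Return (PX - XP, H) for the truncated matrices."""
    beta = mass * omega / hbar
    x, p = oscillator_matrices(size, beta, hbar)
    px, xp = matmul(p, x), matmul(x, p)
    p2, x2 = matmul(p, p), matmul(x, x)
    comm = [[px[i][j] - xp[i][j] for j in range(size)] for i in range(size)]
    ham = [[(p2[i][j] + (mass * omega) ** 2 * x2[i][j]) / (2.0 * mass)
            for j in range(size)] for i in range(size)]
    return comm, ham


if __name__ == "__main__":
    comm, ham = commutator_and_hamiltonian(8)
    print([comm[n][n] for n in range(7)])
    print([ham[n][n].real for n in range(7)])
```

The diagonal of the commutator reproduces $-i\hbar$, and H comes out diagonal with the elements $(n + \tfrac{1}{2})\hbar\omega$.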


It is interesting to note here that a Hermitian operator, defined by (15), generates a Hermitian matrix (cf. sec. 11.10). For

$$\int \phi_i^* P\phi_j\, d\tau = \int \phi_j P^*\phi_i^*\, d\tau$$

simply means

$$P_{ij} = P_{ji}^*$$

in our present notation.

The success of Heisenberg's directions is now easily understood. The differential operators which obey relations analogous to those prescribed for Heisenberg's matrices (71) are

$$Q_m = q_m, \qquad P_m = -i\hbar\frac{\partial}{\partial q_m}$$

in other words precisely the former, Schrodinger operators.¹⁷ Suppose we select an orthonormal set of functions, $\phi_i$, belonging to the operator L, and construct

$$(Q_m)_{ij} = \int \phi_i^* q_m\phi_j\, d\tau, \quad \text{etc.}$$

17 The fact that there are also others, like the ones considered in problem a, need not disturb us here. The Schrodinger equation which results when they are used looks different, to be sure, but reduces to its familiar form when a change of variable is made.


When these matrices are substituted into the functional form H the result is the same as if we had at once formed

$$H_{ij} = \int \phi_i^* H\phi_j\, d\tau$$

as follows from the theorem we have proved. But the only condition under which this matrix can be diagonal is

$$H\phi_j = \text{const.}\ \phi_j \tag{11-83}$$

that is to say, the $\phi$-functions must be chosen to be eigenfunctions of the Hamiltonian H. The problem of making the matrix H diagonal is equivalent to selecting the proper $\phi_i$, i.e., to solving the Schrodinger equation. To see that the diagonal elements of H are the permissible energies $E_i$ of the former theory, we need only substitute $H\phi_j = E_j\phi_j$ into (83), obtaining

$$H_{ij} = E_j\delta_{ij}$$

It is easy to extend the Heisenberg theory beyond the limits of the present development. The second postulate, eq. (8), is valid if P is interpreted as a matrix and $\psi_\lambda$ as a vector. In the terminology of Chapter 10, the $\psi_\lambda$ are then the eigenvectors of the matrix P, and the $p_\lambda$ are its eigenvalues. The relation of the eigenvectors to the state functions is not difficult to see. Suppose we choose a basic orthonormal set of functions, $\phi_i$. Expand the eigenfunction $\psi_i$ appearing in the operator equation

$$P\psi_i = p_i\psi_i \tag{11-84}$$

in terms of them, viz., $\psi_i = \sum_\lambda a_{i\lambda}\phi_\lambda$.

Now multiply (84) by $\phi_j^*$ and integrate. We find immediately

$$\sum_\lambda P_{j\lambda}a_{i\lambda} = p_i a_{ij}$$

and conclude that the eigenvector $\psi_i$ has as components the coefficients which appear in its expansion in terms of the basic $\phi$. More explicitly,

$$\psi_i = \begin{pmatrix} a_{i1} \\ a_{i2} \\ \vdots \end{pmatrix}$$

The last equation then reads $(P\psi_i)_j = p_i(\psi_i)_j$. If the basic set is identical with the eigenfunctions of the operator P, the eigenvector has only one





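On identifying the $g_{\lambda\mu}$ from (69) we find (putting $q_1 = \beta$, $q_2 = \alpha$, $q_3 = \gamma$)

$$(g_{\lambda\mu}) = \begin{pmatrix} A & 0 & 0 \\ 0 & A\sin^2\beta + C\cos^2\beta & C\cos\beta \\ 0 & C\cos\beta & C \end{pmatrix}$$

hence

$$(g^{\lambda\mu}) = \begin{pmatrix} \dfrac{1}{A} & 0 & 0 \\ 0 & \dfrac{1}{A\sin^2\beta} & -\dfrac{\cos\beta}{A\sin^2\beta} \\ 0 & -\dfrac{\cos\beta}{A\sin^2\beta} & \dfrac{1}{C} + \dfrac{\cos^2\beta}{A\sin^2\beta} \end{pmatrix}, \qquad g = A^2C\sin^2\beta$$

When these results are substituted in the expression for $\nabla^2$ we have

$$T\psi = -\frac{\hbar^2}{2}\nabla^2\psi = -\frac{\hbar^2}{2\sin\beta}\left\{\frac{\partial}{\partial\beta}\left(\frac{\sin\beta}{A}\frac{\partial\psi}{\partial\beta}\right) + \frac{\partial}{\partial\alpha}\left[\frac{\sin\beta}{A\sin^2\beta}\frac{\partial\psi}{\partial\alpha} - \frac{\sin\beta\cos\beta}{A\sin^2\beta}\frac{\partial\psi}{\partial\gamma}\right] + \frac{\partial}{\partial\gamma}\left[-\frac{\sin\beta\cos\beta}{A\sin^2\beta}\frac{\partial\psi}{\partial\alpha} + \left(\frac{\sin\beta}{C} + \frac{\sin\beta\cos^2\beta}{A\sin^2\beta}\right)\frac{\partial\psi}{\partial\gamma}\right]\right\}$$

$$= -\frac{\hbar^2}{2A}\left[\frac{\partial^2\psi}{\partial\beta^2} + \cot\beta\,\frac{\partial\psi}{\partial\beta} + \frac{1}{\sin^2\beta}\frac{\partial^2\psi}{\partial\alpha^2} + \left(\cot^2\beta + \frac{A}{C}\right)\frac{\partial^2\psi}{\partial\gamma^2} - \frac{2\cos\beta}{\sin^2\beta}\frac{\partial^2\psi}{\partial\alpha\,\partial\gamma}\right]$$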

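Since the potential energy in this problem is zero, the Schrodinger equation becomes

$$T\psi = E\psi$$

It is separable; for if we put

$$\psi = u(\alpha) \cdot v(\gamma) \cdot w(\beta)$$

the functions u and v are seen to satisfy equations of the form

$$a_2\frac{d^2u}{d\alpha^2} + a_1\frac{du}{d\alpha} + a_0u = 0, \qquad b_2\frac{d^2v}{d\gamma^2} + b_1\frac{dv}{d\gamma} + b_0v = 0$$

where the coefficients $a_2$, $a_1$, $a_0$ are not functions of $\alpha$, and the coefficients $b_2$, $b_1$, $b_0$ are not functions of $\gamma$. Their solutions are therefore exponentials, $u = e^{im\alpha}$, $v = e^{ik\gamma}$,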


m and k being roots of algebraic quadratic equations involving tl 
cients a and b. However, these need not be solved here, since 
dition of single-valuedness dictates that m and k be integers. We i 
put 


u = e iMa , v = e iK -* 


M,K = 0, ±1, ±2, etc. 


The Schrödinger equation now reduces to the following
differential equation in the independent variable $\beta$:

$$w'' + \cot\beta\,w' - \left[\frac{M^2}{\sin^2\beta} + \left(\cot^2\beta + \frac{A}{C}\right)K^2 - 2\frac{\cos\beta}{\sin^2\beta}KM - \frac{2A}{\hbar^2}E\right]w = 0$$


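The substitutions

$$\tfrac{1}{2}(1 - \cos\beta) = x, \qquad w(\beta) = x^{|K-M|/2}(1 - x)^{|K+M|/2}F(x)$$

which are suggested when this equation is examined for its singularities
along the lines of Chapter 2, transform it to

$$(x^2 - x)\frac{d^2F}{dx^2} + \left[(1 + p)x - q\right]\frac{dF}{dx} - n(p + n)F = 0$$

the new parameters being defined as follows:

$$p = 1 + |K - M| + |K + M|, \qquad q = 1 + |K - M|$$

$$n(p + n) = A\left(\frac{2E}{\hbar^2} - \frac{K^2}{C}\right) + K^2 - \tfrac{1}{4}(p - 1)^2 - \tfrac{1}{2}(p - 1)$$

This last relation, when rearranged, may be written

$$E = \frac{\hbar^2}{2}\left[\frac{1}{A}\left(n + \tfrac{1}{2}p - \tfrac{1}{2}\right)\left(n + \tfrac{1}{2}p + \tfrac{1}{2}\right) + \left(\frac{1}{C} - \frac{1}{A}\right)K^2\right] \tag{11-70}$$

Reference to Chapter 2, eq. 56 will show at once that the differential
equation for F is none other than the familiar hypergeometric equation
defining the Jacobi polynomials, provided n is an integer. Unless this
condition is satisfied, F will diverge for x = 1, i.e., for $\beta = \pi$.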

Eq. (70) takes a simpler form when we introduce the new quantum
number

$$J = n + \tfrac{1}{2}(p - 1)$$

which is evidently a positive integer or zero. We then obtain

$$E = \frac{\hbar^2}{2}\left[\frac{J(J + 1)}{A} + \left(\frac{1}{C} - \frac{1}{A}\right)K^2\right]$$

an equation which determines the energy levels of the symmetrical top.
Note that the quantity $\tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M|$ is equal to the larger
of the two integers $|K|$ and $|M|$; in consequence of this neither $|K|$ nor $|M|$
can be greater than J.

The energy levels of the spherical top (A = C) are those already
obtained in sec. 11.12 (cf. eq. 11-53).


MATRIX MECHANICS

11.15. General Remarks and Procedure. The formulation of quantum
mechanics we have given in the foregoing sections was historically preceded
by Heisenberg's matrix theory. The latter, while it appears at first
glance to be an altogether different mathematical structure, strikingly
produced the same results as the former. But when the initial amazement
subsided both formulations were recognized as equivalent. In the present
text the Schrödinger-Dirac theory was discussed first because its axioms
seem perhaps less strange, and because its point of view has been more
widely adopted. The terminology of matrix mechanics, however, enjoys
great popularity and is often conducive to clarity of expression.

It is possible, and perhaps pedagogically worth while, to derive Heisenberg's
theory from the postulates stated earlier in this chapter. But when this
is done, the impressive element of uniqueness which attaches to matrix
mechanics is completely lost. To preserve it we proceed to state the basic
tenets of the theory first, to give an example of its application, and then to
exhibit its relation to the preceding developments. We can afford to be
brief, for when the equivalence of the two theories is once established, no
new insight is likely to be gained by deducing former results over again in a
different manner. As before, attention will be limited to what we have
called quantum statics. The principal facts of Chapter 10 will be used.

Heisenberg associates with every observable a square Hermitian matrix.
As in the Schrödinger theory, one of the chief concerns of matrix mechanics
is the determination of the measurable values of an observable. Let it be
desired to find the observable values of a quantity H, which, classically,
is a function of the Cartesian coordinates $q_i$ and momenta $p_i$,

tj f ^ Am* avomnla tito oVtall anApifv f.r» Ko 


11.16 


QUANTUM MECHANICS 


372 


Find a set of matrices $Q_1, Q_2, \dots, Q_n$; $P_1, P_2, \dots, P_n$ which (a) satisfy
the commutation rules

$$Q_mQ_n - Q_nQ_m = 0, \qquad P_mP_n - P_nP_m = 0, \qquad P_mQ_n - Q_nP_m = -i\hbar\,\delta_{nm}E \tag{11-71}$$

where E is the unit matrix; (b) render the matrix

$$H(Q_1 \cdots Q_n; P_1 \cdots P_n) \ \text{diagonal} \tag{11-72}$$

By $H(Q_1 \cdots Q_n; P_1 \cdots P_n)$ is meant, of course, the matrix which is
the same function of the matrices $Q_1 \cdots P_n$ that the ordinary function H
is of $q_1 \cdots p_n$. The existence of the matrix H and its uniqueness will be
assumed. When such a set of matrices has been found, the diagonal
elements of H will be the measurable values in question. (It is also true that
the squares of the absolute values of the elements $(Q_i)_{\lambda\mu}$ are simply related
to spectroscopic transition probabilities, as will be shown later; but this
does not concern us here.) We illustrate the power of the method by an
example.
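Conditions (a) and (b) can be checked numerically on a small case. The sketch below is not taken from the text: it assumes the standard oscillator matrices in units where $\hbar = m = \omega = 1$, truncated to N rows and columns (the true matrices are infinite), and the helper names `zeros`, `matmul` and `oscillator_matrices` are hypothetical. Because of the truncation, the commutation rule can only be expected to hold away from the last row and column.

```python
import math

N = 8  # truncation size; an assumption, since the true matrices are infinite


def zeros(n):
    """n x n matrix of complex zeros."""
    return [[0j] * n for _ in range(n)]


def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def oscillator_matrices(n):
    """Standard oscillator Q and P (assumed form), units hbar = m = omega = 1."""
    q, p = zeros(n), zeros(n)
    for k in range(n - 1):
        s = math.sqrt((k + 1) / 2.0)
        q[k][k + 1] = q[k + 1][k] = s
        p[k][k + 1] = -1j * s
        p[k + 1][k] = 1j * s
    return q, p


Q, P = oscillator_matrices(N)
PQ, QP = matmul(P, Q), matmul(Q, P)
# Condition (a): PQ - QP should equal -i * (unit matrix) away from the cut-off.
commutator = [[PQ[i][j] - QP[i][j] for j in range(N)] for i in range(N)]
# Condition (b): H = (P^2 + Q^2)/2 should come out diagonal.
P2, Q2 = matmul(P, P), matmul(Q, Q)
H = [[0.5 * (P2[i][j] + Q2[i][j]) for j in range(N)] for i in range(N)]
```

With these matrices the diagonal of `H` reproduces the familiar half-integer ladder except in the last position, which is spoiled by the truncation.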

11.16. Simple Harmonic Oscillator. The Hamiltonian function is
(cf. sec. 11)

$$H = \frac{p^2}{2m} + \frac{m}{2}\omega^2q^2$$

Hence, if P and Q are matrices,

$$H(Q, P) = \frac{1}{2m}\left(P^2 + m^2\omega^2Q^2\right)$$

The straightforward way of working this problem would be to select a
set of matrices such as, e.g.,

^ ij P\f& ~ |-i (11 —73) 

which satisfy the commutation rule (71):

$$(PQ)_{\lambda\mu} - (QP)_{\lambda\mu} = -i\hbar\,\delta_{\lambda\mu} \tag{11-71a}$$

as the reader may verify. These must then be subjected to a similarity
transformation with some other matrix, say S, until the new matrices

$$Q' = S^{-1}QS, \qquad P' = S^{-1}PS$$

when substituted in H, make H a diagonal matrix. (Cf. Chap. 10.) This
procedure, however, is usually very cumbersome and is rarely used. The
success of the matrix method depends frequently on fortunate guesses or

A * _ a!. . TT * ! j . * T i 1 . * j jl . 


The correct solution of eq. (90) is certainly not of this exact form
because of the "interaction term" $e^2/r_{12}$, whose effect on $\psi$ one would
expect to be very complicated indeed. Aside from other changes, it will
cause $\psi$ to depend on $r_{12}$ explicitly. But from a physical point of view, the
repulsion between the electrons will cause both of them to be, on the average,
farther away from the nucleus than if the repulsion were absent. This
would mean that the functions u and v are in error with respect to the scale
factor $Z/a_0$. If this were smaller, a more extended probability distribution
would result. (For the helium ion Z = 2.) It would seem expedient,
therefore, that we take as our "trial" function in the variational procedure
the function $\phi = u(1)v(2)$ but with an undetermined Z.

In calculating

$$\int\phi H\phi\,d\tau \tag{11-91}$$

it is well to have available the differential equation whose solutions are u
and v:

$$-\frac{\hbar^2}{2m}\nabla_1^2u(1) - \frac{Ze^2}{r_1}u(1) = Z^2E_Hu(1), \qquad v(2) = u(2) \tag{11-92}$$

Here $E_H$ is the energy of the normal hydrogen atom, $E_H = -e^2/2a_0$
(= −13.53 e. volts). The differential $d\tau$ in (91) represents, of course, the
product of the volume elements for the two electrons. When H is taken
from (90), we find, using (92) and the fact that u is normalized,

$$\int\phi H\phi\,d\tau = 2Z^2E_H + (Z - 2)e^2\int\left(\frac{1}{r_1} + \frac{1}{r_2}\right)\phi^2\,d\tau + e^2\int\frac{\phi^2}{r_{12}}\,d\tau \tag{11-93}$$

The integral

$$\int\frac{\phi^2}{r_1}\,d\tau = \int\frac{u^2(1)}{r_1}\,d\tau_1\cdot\int u^2(2)\,d\tau_2 = \int\frac{u^2}{r_1}\,r_1^2\,dr_1\sin\theta\,d\theta\,d\varphi$$

is easily computed directly. It has, in fact, already been evaluated; its value
is $Z/a_0$. The integral $\int(\phi^2/r_2)\,d\tau$ has the same value. We leave the
evaluation of $e^2\int(\phi^2/r_{12})\,d\tau$ for later; its value is $-\tfrac{5}{4}ZE_H$.
Hence, eq. (93) becomes

$$\int\phi H\phi\,d\tau = 2Z^2E_H + (Z - 2)\cdot2Z\frac{e^2}{a_0} - \frac{5}{4}ZE_H = Z\left[2Z - 4(Z - 2) - \tfrac{5}{4}\right]E_H$$




This expression is to be made as small as possible by choosing Z properly,
i.e., the coefficient must take its maximum value because $E_H < 0$. Putting
the derivative with respect to Z equal to zero, we find for the minimizing Z
the value 27/16, which is somewhat less than 2 as we expected. Hence
the best energy value attainable by adjusting Z in our function is
$Z(27/4 - 2Z)E_H = 5.695E_H$. The energy found experimentally is
$5.807E_H$. The difference between these two values is to be ascribed to the
defects of the simple trial function here chosen.
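The arithmetic of this minimization is easy to check exactly. The following sketch uses rational arithmetic; the function name `energy_coefficient` is hypothetical, and the coefficient it returns multiplies $E_H$, which is negative, so the largest coefficient corresponds to the lowest energy.

```python
from fractions import Fraction


def energy_coefficient(Z):
    """Coefficient of E_H in the trial energy: Z[2Z - 4(Z - 2) - 5/4]."""
    return Z * (2 * Z - 4 * (Z - 2) - Fraction(5, 4))


# The coefficient is a quadratic c(Z) = a Z^2 + b Z with c(0) = 0.
# Recover a and b exactly from values at Z = 1 and Z = -1.
c_plus = energy_coefficient(Fraction(1))
c_minus = energy_coefficient(Fraction(-1))
a = (c_plus + c_minus) / 2
b = (c_plus - c_minus) / 2

# Since a < 0 the stationary point is a maximum of the coefficient.
Z_best = -b / (2 * a)
best_coefficient = energy_coefficient(Z_best)
```

Here `Z_best` is the screened nuclear charge and `float(best_coefficient)` is the multiple of $E_H$ quoted in the text to three decimals.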



Fig. 11-4 


A very interesting summary of the results of the present method as 
applied to helium is given by Pauling and Wilson. 20 Their table shows how 
the value of the integral approaches the experimental energy as increasingly 
refined trial functions are used. 

To complete the analysis we indicate how the integral

$$I = e^2\int\frac{\phi^2}{r_{12}}\,d\tau$$

may be computed. The method is typical of the evaluation of "double
volume" integrals involving the variable $r_{12}$, and hence perhaps of some
interest. The volume element

$$d\tau = r_1^2\,dr_1\sin\theta_1\,d\theta_1\,d\varphi_1\cdot r_2^2\,dr_2\sin\theta_2\,d\theta_2\,d\varphi_2$$

may also be expressed as follows (see Fig. 4). If the polar axis for electron 2
is taken along $r_1$, then $\theta_2$ is the angle between $r_1$ and $r_2$, and we write $\chi$
for the corresponding azimuth. Now $r_{12}^2 = r_1^2 + r_2^2 - 2r_1r_2\cos\theta_2$, whence
$r_{12}\,dr_{12} = r_1r_2\sin\theta_2\,d\theta_2$ provided $r_1$ and $r_2$ are held fixed. By means of
this relation $\sin\theta_2\,d\theta_2$ may be eliminated from the last expression for $d\tau$,
and we obtain

$$d\tau = r_1\,dr_1\,r_2\,dr_2\,r_{12}\,dr_{12}\sin\theta_1\,d\theta_1\,d\varphi_1\,d\chi$$

Substitute this volume element into I, and integrate at once over the
angles, thus introducing the factors $2\cdot2\pi\cdot2\pi$. On using the abbreviation
$\alpha = 2Z/a_0$, we obtain

$$I = \frac{e^2\alpha^6}{8}\iiint e^{-\alpha(r_1 + r_2)}\,r_1\,dr_1\,r_2\,dr_2\,dr_{12}$$

The ranges of integration are: $0 \le r_1 < \infty$, $0 \le r_2 < \infty$; $|r_2 - r_1| \le r_{12} \le r_1 + r_2$.
The absolute value sign on the limit for $r_{12}$ forces us to split
the integration over $r_2$ into two parts, (a) $r_2 > r_1$, (b) $r_2 < r_1$. In range (a)
the lower limit of $r_{12}$ is $r_2 - r_1$, in case (b) it is $r_1 - r_2$. Thus

$$I = \frac{e^2\alpha^6}{8}\left[\int_0^\infty e^{-\alpha r_1}r_1\,dr_1\int_{r_1}^\infty e^{-\alpha r_2}r_2\,dr_2\int_{r_2 - r_1}^{r_2 + r_1}dr_{12} + \int_0^\infty e^{-\alpha r_2}r_2\,dr_2\int_{r_2}^\infty e^{-\alpha r_1}r_1\,dr_1\int_{r_1 - r_2}^{r_1 + r_2}dr_{12}\right]$$

Inspection shows that the two triple integrals are equal. The calculation
is now perfectly straightforward; it makes use of the formula

$$\int_0^\infty e^{-px}x^n\,dx = p^{-(n+1)}\,n!$$

and leads to the result

$$I = \frac{5}{8}Z\frac{e^2}{a_0}$$

which was used above.
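The value of the triple integral can be checked by brute force. This is a numerical sketch under stated assumptions, not the method of the text: the $r_{12}$ integration is done analytically (it contributes the length of the interval, $2\min(r_1, r_2)$), the remaining double integral is estimated with a midpoint rule on a finite grid, and units with $e^2 = a_0 = 1$ are assumed. The function names are hypothetical.

```python
import math


def radial_integral(alpha=1.0, r_max=30.0, n=400):
    """Midpoint estimate of the triple integral of exp(-alpha(r1+r2)) r1 r2.

    For fixed r1, r2 the r12 integral runs from |r1 - r2| to r1 + r2 and
    therefore contributes the factor 2 * min(r1, r2).
    """
    h = r_max / n
    r = [(i + 0.5) * h for i in range(n)]
    g = [math.exp(-alpha * x) * x for x in r]
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += g[i] * g[j] * 2.0 * min(r[i], r[j])
    return total * h * h


def closed_form(alpha=1.0):
    """Value obtained from the factorial formula: 5 / (2 alpha^5)."""
    return 5.0 / (2.0 * alpha ** 5)


def interaction_integral(Z):
    """I in units of e^2/a0, using alpha = 2Z and the prefactor alpha^6 / 8."""
    alpha = 2.0 * Z
    return alpha ** 6 / 8.0 * closed_form(alpha)
```

For example, `interaction_integral(2.0)` evaluates the closed form for the helium nuclear charge, and `radial_integral()` can be compared with `closed_form()` to see that the two agree to within the accuracy of the grid.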

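11.20. The Method of Linear Variation Functions. It is often convenient
to use as the trial function $\phi$ in $\int\phi^*H\phi\,d\tau$ a linear combination of
n definite functions $u_i$ which are judged suitable for the problem at hand.
The coefficients appearing in the linear combination may then be treated as
adjustable parameters. Thus, assume

$$\phi = \sum_{\lambda=1}^{n}a_\lambda u_\lambda \tag{11-94}$$

where the u's need not form an orthonormal set. We define

$$E = \frac{\int\phi^*H\phi\,d\tau}{\int\phi^*\phi\,d\tau}, \qquad \mathcal{H}_{ij} = \int u_i^*Hu_j\,d\tau, \qquad \Delta_{ij} = \int u_i^*u_j\,d\tau$$

The symbol $\mathcal{H}_{ij}$, in place of $H_{ij}$, is to remind the reader of the fact that the
matrix $\mathcal{H}$ does not possess the simple properties of H because the former is
not constructed with an orthonormal, complete set of functions. The
denominator in the expression for E is needed to normalize the function $\phi$.
According to the variational principle, $E \ge E_0$, the lowest energy state
of the system.

We wish to find the condition that E shall be a minimum, and the minimal
value of E. Insertion of (94) gives

$$E = \frac{\sum_{\lambda\mu}a_\lambda^*a_\mu\mathcal{H}_{\lambda\mu}}{\sum_{\lambda\mu}a_\lambda^*a_\mu\Delta_{\lambda\mu}}$$

This expression will be an extremum, and we hope a minimum, if the
coefficients are so adjusted that $\partial E/\partial a_k^*$ and $\partial E/\partial a_k$ are zero for every k
from 1 to n. Let us take the derivative with respect to $a_k^*$ on both sides of
the last equation after it is written in the form

$$E\sum_{\lambda\mu}a_\lambda^*a_\mu\Delta_{\lambda\mu} = \sum_{\lambda\mu}a_\lambda^*a_\mu\mathcal{H}_{\lambda\mu} \tag{11-95}$$

The result is

$$\frac{\partial E}{\partial a_k^*}\sum_{\lambda\mu}a_\lambda^*a_\mu\Delta_{\lambda\mu} + E\sum_\mu a_\mu\Delta_{k\mu} = \sum_\mu a_\mu\mathcal{H}_{k\mu}, \qquad k = 1, 2, \cdots, n$$

When the first term is omitted ($\partial E/\partial a_k^* = 0$) the remainder of the
equation represents the condition that E shall be a minimum. Differentiation
of (95) with respect to $a_k$ leads in a similar way to

$$E\sum_\lambda a_\lambda^*\Delta_{\lambda k} = \sum_\lambda a_\lambda^*\mathcal{H}_{\lambda k}$$

an equation which is simply the conjugate of the former. Both may
conveniently be written

$$\sum_\mu a_\mu(\mathcal{H}_{k\mu} - \Delta_{k\mu}E) = 0, \qquad k = 1, 2, \cdots, n \tag{11-96}$$

If this system of equations is to have a solution different from the trivial
one: every $a_\mu = 0$, then the determinant constructed from the coefficients
of the $a_\mu$ must vanish. Thus

$$\begin{vmatrix} \mathcal{H}_{11} - \Delta_{11}E & \mathcal{H}_{12} - \Delta_{12}E & \cdots & \mathcal{H}_{1n} - \Delta_{1n}E \\ \mathcal{H}_{21} - \Delta_{21}E & \mathcal{H}_{22} - \Delta_{22}E & \cdots & \mathcal{H}_{2n} - \Delta_{2n}E \\ \vdots & \vdots & & \vdots \\ \mathcal{H}_{n1} - \Delta_{n1}E & \mathcal{H}_{n2} - \Delta_{n2}E & \cdots & \mathcal{H}_{nn} - \Delta_{nn}E \end{vmatrix} = 0 \tag{11-97}$$

This is an equation of the n-th degree in E and therefore has n roots. The
lowest of these will be an approximation to the lowest energy of the system.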
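For n = 2 the secular equation is an ordinary quadratic and can be solved in closed form. The sketch below assumes real symmetric matrix elements and normalized trial functions (unit diagonal in the overlap matrix, off-diagonal overlap of magnitude less than one); the function name `secular_roots_2x2` and the sample numbers are illustrative only.

```python
import math


def secular_roots_2x2(h_aa, h_ab, h_bb, d_ab):
    """Roots of det(H - E*Delta) = 0 for two normalized, non-orthogonal functions.

    Expanding (h_aa - E)(h_bb - E) - (h_ab - d_ab*E)^2 = 0 gives a*E^2 + b*E + c = 0.
    """
    a = 1.0 - d_ab ** 2
    b = -(h_aa + h_bb) + 2.0 * h_ab * d_ab
    c = h_aa * h_bb - h_ab ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted(((-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)))


# Illustrative numbers (not from the text): equal diagonal elements.
lower, upper = secular_roots_2x2(-0.5, -0.4, -0.5, 0.5)
```

When the two diagonal elements are equal, the roots reduce to $(\mathcal{H}_{11} \pm \mathcal{H}_{12})/(1 \pm \Delta_{12})$, which is the form that arises for two equivalent trial functions.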





berg theory if its form is suitably changed. We interpret $\phi$ as a vector $\phi$
with components $a_\lambda$, the $a_\lambda$ being the coefficients in the expansion of the
function $\phi = \sum_\lambda a_\lambda\psi_\lambda$ in terms of our basic $\psi_\lambda$ ($\phi$ without subscript here
denotes an arbitrary state function, not necessarily one of the set $\psi_\lambda$), but
$\phi^*$ not as the complex conjugate, but the associate vector:

$$\phi^\dagger = (a_1^*\ a_2^*\ a_3^*\ \cdots)$$

P represents the matrix $P_{ij} = \int\psi_i^*P\psi_j\,d\tau$. Eq. (14) must then be
modified to

$$\bar{p} = \phi^\dagger P\phi$$

which reads, when written more explicitly,

$$\bar{p} = \sum_{\lambda\mu}a_\lambda^*P_{\lambda\mu}a_\mu$$

When the $\psi_\lambda$ are taken to be the eigenstates of the operator P, the
matrix P becomes diagonal, and $\bar{p} = \sum_\lambda a_\lambda^*a_\lambda p_\lambda$, which is the same relation
as was found in the Schrödinger theory under these conditions.


Problem. Calculate the integral

$$\int x^re^{-x^2}H_n(x)H_m(x)\,dx$$

by the methods of matrix mechanics. Let $\psi_n = c_ne^{-x^2/2}H_n(x)$, $\psi_m = c_me^{-x^2/2}H_m(x)$,
where $c_n$, $c_m$ are normalizing factors, and note that, aside from normalizing factors, the
integral is the matrix element $(x^r)_{nm}$. Now $x_{nm}$ is given by eq. (3-93); this may be used
in calculating

$$(x^r)_{nm} = \sum_{\lambda\mu\cdots\rho}x_{n\lambda}x_{\lambda\mu}x_{\mu\nu}\cdots x_{\rho m}$$
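The matrix route suggested in this problem can be compared with direct numerical integration. The sketch below rests on assumptions not stated in the problem: that the only non-vanishing elements of x are $x_{n,n+1} = x_{n+1,n} = \sqrt{(n+1)/2}$, that the normalizing factors are $c_n = (2^nn!\sqrt{\pi})^{-1/2}$, and that the integral extends over the whole real line. The matrix is truncated to a finite size, which is harmless as long as n, m and r are small compared with it. All function names are hypothetical.

```python
import math


def hermite(n, x):
    """Hermite polynomial H_n(x) from the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def integral_direct(r, n, m, half_width=12.0, steps=4800):
    """Trapezoidal estimate of the integral of x^r exp(-x^2) H_n H_m over the real line."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        total += x ** r * math.exp(-x * x) * hermite(n, x) * hermite(m, x)
    return total * dx


def integral_matrix(r, n, m, size=20):
    """Same integral from the r-th power of the truncated matrix of x."""
    x_mat = [[0.0] * size for _ in range(size)]
    for k in range(size - 1):
        x_mat[k][k + 1] = x_mat[k + 1][k] = math.sqrt((k + 1) / 2.0)
    power = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(r):
        power = [[sum(power[i][k] * x_mat[k][j] for k in range(size))
                  for j in range(size)] for i in range(size)]

    def inverse_c(k):
        # 1/c_k for the assumed normalization of psi_k
        return math.sqrt(2.0 ** k * math.factorial(k) * math.sqrt(math.pi))

    return power[n][m] * inverse_c(n) * inverse_c(m)
```

A call such as `integral_matrix(3, 1, 2)` sums the same products of elements as the formula above; `integral_direct(3, 1, 2)` provides an independent check.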


APPROXIMATION METHODS FOR SOLVING EIGENVALUE PROBLEMS 


11.18. Variational (Ritz) Method. In Chapter 8 we showed that the
differential equation $L(u) + \lambda wu \equiv (pu')' - qu + \lambda wu = 0$ is the necessary
(though not sufficient!) condition upon u if it is to minimize the integral
$A(u) = \int(pu'^2 + qu^2)\,dx$. Furthermore, it was seen that A(u) could
be transformed (cf. eq. 8-37) by simple steps to $-\int uL(u)\,dx$. The
theory in this simple form is applicable to every one-dimensional
Schrödinger equation, for in that case the Hamiltonian operator
$H = -(\hbar^2/2m)(d^2/dx^2) + V(x)$ is of the form $-L$ if only we identify p
with $\hbar^2/2m$ and q with V. Hence we may at once say that the Schrödinger
equation is the necessary condition upon $\psi$ so that the integral

$$\int\psi H\psi\,d\tau$$

shall be a minimum. The one-dimensional variation theory may also be
applied, though in a somewhat more cumbersome manner, to every ordinary
differential equation to which the multi-dimensional Schrödinger
equation gives rise on separation of variables. It is possible, however, to
prove a far more general theorem which is of utmost utility in numerous
problems of applied mathematics, a theorem of which the former statement
is a special case.

Let P be a Hermitian operator. We wish to find the normalized function
$\psi$ which will make the integral

$$\int\psi^*P\psi\,d\tau$$

a minimum. The integration extends, as usual, over configuration space,
and we shall assume for the sake of definiteness that $\tau$ is a finite portion of
configuration space. Certainly, the necessary condition upon $\psi$ is that

$$\delta\int\psi^*P\psi\,d\tau - \lambda\,\delta\int\psi^*\psi\,d\tau$$

shall vanish; $\lambda$ is an undetermined (Lagrangian) multiplier (cf. sec. 6.5).
Now the variation symbol and the integral sign are commutable in this
expression because the limits of the integration are supposed finite and
fixed. Hence we have

$$\int\delta\psi^*\cdot P\psi\,d\tau + \int\psi^*\cdot\delta(P\psi)\,d\tau - \lambda\int\delta\psi^*\cdot\psi\,d\tau - \lambda\int\psi^*\,\delta\psi\,d\tau = 0 \tag{11-85}$$

The second integral in this expression may be transformed in two steps.
First, $\delta(P\psi)$ may be replaced by $P(\delta\psi)$ since the operator P suffers no variation.
Second, because P is Hermitian and both $\psi$ and $\delta\psi$ are acceptable
functions, $\int\psi^*P\,\delta\psi\,d\tau = \int\delta\psi\cdot P^*\psi^*\,d\tau$. Eq. (85) therefore reads

$$\int\delta\psi^*(P\psi - \lambda\psi)\,d\tau + \int\delta\psi(P^*\psi^* - \lambda\psi^*)\,d\tau = 0 \tag{11-86}$$

Here $\delta\psi$ is an entirely arbitrary function. Let us take it to be real, so that
$\delta\psi^* = \delta\psi$. Eq. (86) can then be satisfied only if

$$P\psi - \lambda\psi + P^*\psi^* - \lambda\psi^* = 0$$

On the other hand, if we take $\delta\psi$ to be imaginary, so that $\delta\psi^* = -\delta\psi$,
we conclude

$$P\psi - \lambda\psi - P^*\psi^* + \lambda\psi^* = 0$$

Addition of the last two equations yields $P\psi = \lambda\psi$; subtraction gives
$P^*\psi^* = \lambda\psi^*$. We have shown that, if

$$\delta\int\psi^*P\psi\,d\tau = 0 \tag{11-87}$$

for normalized $\psi$, this function must satisfy the eigenvalue equation

$$P\psi = \lambda\psi \tag{11-88}$$

which also automatically determines $\lambda$. Whether, when (88) is satisfied,
the minimum of, or indeed the integral, $\int\psi^*P\psi\,d\tau$ actually exists, is a
point we have not investigated. It is customary in physics not to worry
about these eventualities, for they are difficult to discuss. The mathematical
equivalence of the minimal property of the integral and eq. (88) is
usually taken as a matter of faith.

If $\psi$ satisfies eq. (88), then $\int\psi^*P\psi\,d\tau = \lambda$. From what has been said
it follows, therefore, that the integral $\int\phi^*P\phi\,d\tau$ computed with a function
$\phi$ different from the minimizing $\psi$ cannot be smaller than $\lambda$. But here a
slight complication arises, for there are many eigenvalues $\lambda$. All that we
can really say is that for a function $\phi$ in the "neighborhood" of $\psi_i$, the
integral will not be greater than $\lambda_i$. Certainly, however,

$$\int\phi^*P\phi\,d\tau \ge \lambda_0 \tag{11-89}$$

if $\phi$ is any analytic and continuous function 18 and $\lambda_0$ the lowest eigenvalue.

The Ritz method, 19 named after its inventor, is a systematic procedure,
based upon the foregoing variational considerations, for solving the eigenvalue
equation (88) by substituting into the integral in (87) a suitable
sequence of functions which causes the integral to converge upon the
value $\lambda$. Instead of presenting the method in its original form, we shall
here work out some of its features in a manner more directly adapted to
the needs of quantum mechanics, and with a slight loss of rigor. We are
usually interested in finding the energies, particularly the lowest (normal
state) energy of physical or chemical systems, hence we identify at once
the operator P in (89) with the Hamiltonian H.

18 Restriction to functions with a certain number of derivatives is necessary because
P is in general a differential operator, and $P\phi$ must have meaning.

19 Ritz, W., J. f. reine und angew. Math. 136, 1 (1909); Courant-Hilbert, p. 160.

The simplest way of finding an approximation to the lowest energy of a
system is to use (89) directly. Sometimes a good guess can be made as to
the general form of the true state function $\psi$, a form which may allow the
inclusion of one or more arbitrary parameters. The integral in (89) is
then computed with this function, and the result is minimized with respect
to the parameters. An example will clarify the method.
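Before the helium example, the procedure can be seen on a case where the exact answer is known. The sketch below is an illustration chosen here, not one from the text: a one-dimensional harmonic oscillator in units $\hbar = m = \omega = 1$ with the one-parameter trial function $e^{-ax^2}$. The expectation values used ($\langle p^2\rangle = a$ and $\langle x^2\rangle = 1/4a$ for this Gaussian) are standard results assumed without derivation, and the name `trial_energy` is hypothetical.

```python
def trial_energy(a):
    """Energy integral for the normalized trial function exp(-a x^2).

    Units hbar = m = omega = 1 (an assumption of this sketch).
    """
    kinetic = a / 2.0              # <p^2>/2 with <p^2> = a
    potential = 1.0 / (8.0 * a)    # <x^2>/2 with <x^2> = 1/(4a)
    return kinetic + potential


EXACT_GROUND = 0.5  # lowest oscillator energy in these units

# Scan the parameter and keep the value giving the smallest integral.
candidates = [0.05 * k for k in range(1, 61)]   # a from 0.05 to 3.00
best_a = min(candidates, key=trial_energy)
best_energy = trial_energy(best_a)
```

Every candidate gives an energy at or above the exact lowest value, as (89) requires, and the minimum over the parameter happens to reproduce it here because the exact state function is itself a Gaussian.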

11.19. Example: Normal State of the Helium Atom. The helium
atom consists of two electrons moving in the field of a nucleus of charge 2e
and at the same time repelling each other. We consider the nucleus as
stationary and denote the distances of the two electrons from it by $r_1$
and $r_2$ respectively; $r_{12}$ is the interelectronic distance. The potential
energy is $-2e^2(1/r_1 + 1/r_2) + e^2/r_{12}$, and the Schrödinger equation is

$$H\psi \equiv \left[-\frac{\hbar^2}{2m}(\nabla_1^2 + \nabla_2^2) - 2e^2\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{r_{12}}\right]\psi = E\psi \tag{11-90}$$

A subscript on the symbol $\nabla$ indicates that the Laplacian is to be taken
with respect to the coordinates labeled by the subscript. If the term
$e^2/r_{12}$ were absent eq. (90) would be separable, for then the operator H
would be the sum of two helium-ion Hamiltonians, $H = H_1 + H_2$, the
first acting on the coordinates of electron 1, the second on those of electron 2.
But the equation

$$(H_1 + H_2)\psi = E\psi$$

may be separated on substitution of $\psi = u(1)v(2)$, where u(1) stands for a
function of the space coordinates of electron 1, and v(2) is defined similarly.
For it becomes, after division by $\psi$,

$$\frac{H_1u(1)}{u(1)} + \frac{H_2v(2)}{v(2)} = E, \ \text{a constant}$$

which indicates that $H_1u(1) = E_1u(1)$; $H_2v(2) = E_2v(2)$; $E_1 + E_2 = E$.
But the first two of these are simply Schrödinger equations for the singly
charged helium ion, whose solutions we already know. (Cf. eq. 67a.) Since
we wish to find the lowest energy of our system, we identify the functions as
follows:

$$u(1) = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2}e^{-Zr_1/a_0}, \qquad v(2) = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2}e^{-Zr_2/a_0}, \qquad Z = 2$$

and $\psi$ is the product of these.



The other roots approximate, though in general much more poorly, to the
higher states of the system.

11.21. Example: The Hydrogen Molecular Ion Problem. The $H_2^+$ ion
consists of two positive charges +e, which we shall consider stationary
and a distance R apart, and one electron whose distances from the
protons will be denoted by $r_A$ and $r_B$. See Fig. 5.

The Hamiltonian operator is

$$H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{r_A} - \frac{e^2}{r_B} + \frac{e^2}{R}$$

Fig. 11-5

If the terms $e^2/R - e^2/r_B$ were missing, H would be the Hamiltonian of a
hydrogen atom with its proton at A, whose normal state function is (cf.
eq. 67)

$$u_A = (\pi a_0^3)^{-1/2}e^{-r_A/a_0}$$

On the other hand, if the terms $e^2/R - e^2/r_A$ were missing, the normal state
function would be

$$u_B = (\pi a_0^3)^{-1/2}e^{-r_B/a_0}$$

From a physical point of view one of these solutions is as good as the other:
$u_A$ implies that the electron is entirely attached to proton A, $u_B$ that it is
attached to proton B. Neither is the case. Let us see what happens if
we take for the variation function $\phi$ a linear combination of $u_A$ and $u_B$.
We put

$$\phi = a_Au_A + a_Bu_B$$

using letters as subscripts rather than the number indices which appear in
(94). 21 The lowest energy is at once obtained as the lowest root of (97)
which takes the simple form

$$\begin{vmatrix} \mathcal{H}_{AA} - \Delta_{AA}E & \mathcal{H}_{AB} - \Delta_{AB}E \\ \mathcal{H}_{BA} - \Delta_{BA}E & \mathcal{H}_{BB} - \Delta_{BB}E \end{vmatrix} = 0 \tag{11-98}$$

21 In more complicated molecules it is well to label electrons by numbers, nuclei by
letters. We here follow this convention.








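Now $\Delta_{AA} = \int u_A^*u_A\,d\tau = \Delta_{BB} = 1$, because $u_A$ and $u_B$ are normalized.
They are not orthogonal, but $\Delta_{AB} = \Delta_{BA}$. Similarly, $\mathcal{H}_{AB} = \int u_AHu_B\,d\tau = \mathcal{H}_{BA}$
and $\mathcal{H}_{BB} = \mathcal{H}_{AA}$ since H is insensitive to an interchange of A and
B. With these simplifications the two roots of (98) are found to be

$$E_1 = \frac{\mathcal{H}_{AA} + \mathcal{H}_{AB}}{1 + \Delta_{AB}}, \qquad E_2 = \frac{\mathcal{H}_{AA} - \mathcal{H}_{AB}}{1 - \Delta_{AB}} \tag{11-99}$$

The $\phi$-functions corresponding to these energies are obtained from (96):

$$a_A(\mathcal{H}_{AA} - E) + a_B(\mathcal{H}_{AB} - \Delta_{AB}E) = 0$$

$$a_A(\mathcal{H}_{BA} - \Delta_{BA}E) + a_B(\mathcal{H}_{BB} - E) = 0$$

On inserting $E = E_1$ we get $a_A = a_B$; hence the corresponding

$$\phi_1 = c_1(u_A + u_B)$$

If $\phi_1$ is to be normalized, $c_1 = [2(1 + \Delta_{AB})]^{-1/2}$. If $E_2$ is inserted in the same
equations, we find $a_A = -a_B$ so that

$$\phi_2 = c_2(u_A - u_B)$$

The normalizing factor is in this case $c_2 = [2(1 - \Delta_{AB})]^{-1/2}$.

The remainder of the work is the computation of the three quantities
$\Delta_{AB}$, $\mathcal{H}_{AA}$ and $\mathcal{H}_{AB}$. It involves nothing new and will be left to the
reader. The integrals are most easily evaluated in spheroidal coordinates
(cf. eq. 5-40), $\xi = (r_A + r_B)/R$, $\eta = (r_A - r_B)/R$ and $\varphi$, the latter
measured around R. In terms of these

$$d\tau = \frac{R^3}{8}(\xi^2 - \eta^2)\,d\xi\,d\eta\,d\varphi, \quad \text{and} \quad u_Au_B = (\pi a_0^3)^{-1}e^{-(R/a_0)\xi}$$

$$1 \le \xi < \infty; \qquad -1 \le \eta \le 1$$

The following results will be found:

$$\Delta_{AB} = e^{-\rho}\left(1 + \rho + \frac{\rho^2}{3}\right)$$

$$\mathcal{H}_{AA} = E_H + \frac{e^2}{R} + J, \quad \text{where} \quad J = -e^2\int\frac{u_A^2}{r_B}\,d\tau = -\frac{e^2}{R}\left[1 - e^{-2\rho}(1 + \rho)\right]$$

$$\mathcal{H}_{AB} = \left(E_H + \frac{e^2}{R}\right)\Delta_{AB} + K, \quad \text{where} \quad K = -e^2\int\frac{u_Au_B}{r_A}\,d\tau = -\frac{e^2}{R}e^{-\rho}(\rho + \rho^2) \tag{11-100}$$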





The parameter $\rho = R/a_0$; $E_H$ is defined as in sec. 19. The quantities
J and K are of interest. According to its definition, J represents the
Coulomb attraction energy between a negative charge of density $eu_A^2$
and the proton B. The integral K has no such simple interpretation; it is
called an exchange integral. Its importance is best appreciated if $E_1$ and
$E_2$ are written more explicitly with the use of (100):

$$E_1 = E_H + \frac{e^2}{R} + \frac{J + K}{1 + \Delta_{AB}}$$

$$E_2 = E_H + \frac{e^2}{R} + \frac{J - K}{1 - \Delta_{AB}}$$

Because K is negative, $E_1$ is the lower root. Had we omitted the function
$u_B$ from our trial function $\phi$, the variational result would have been

$$E = E_H + \frac{e^2}{R} + J$$

$E_1$ is lower than this by virtue of the presence of K (and of course $\Delta_{AB}$).
But in classical parlance, a lower energy must be regarded as due to the
presence of additional attractive forces between the constituents of the
system, i.e., a hydrogen atom and a proton. These forces would be
given by $\partial K/\partial R$; they are commonly called exchange forces. They
possess no classical interpretation; their significance is rooted entirely in the
variational method through which they arise.

Of course $E_1$ is only an approximation to the true energy, which is
lower for every R. Its most important feature is that it possesses a
minimum, which explains the stability of the $H_2^+$ ion. Classical mechanics
would yield no minimum and is therefore incompetent to account for the
existence of this ion. A detailed comparison of $E_1$ with the experimental
energy is given in Pauling and Wilson. 22
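The minimum of $E_1$ can be located numerically from the closed expressions for $\Delta_{AB}$, J and K. The sketch below works in units where $e^2/a_0 = 1$, so that $E_H = -1/2$ and $e^2/R = 1/\rho$; these units, the grid of $\rho$ values and the function names are choices made for the illustration and are not part of the text.

```python
import math

E_H = -0.5  # hydrogen normal-state energy in units of e^2/a0


def overlap(rho):
    """Overlap integral Delta_AB as a function of rho = R/a0."""
    return math.exp(-rho) * (1.0 + rho + rho * rho / 3.0)


def coulomb_J(rho):
    """Coulomb integral J in units of e^2/a0 (note e^2/R = 1/rho)."""
    return -(1.0 / rho) * (1.0 - math.exp(-2.0 * rho) * (1.0 + rho))


def exchange_K(rho):
    """Exchange integral K in units of e^2/a0."""
    return -(1.0 / rho) * math.exp(-rho) * (rho + rho * rho)


def energies(rho):
    """The two roots E1 (symmetric) and E2 (antisymmetric) at separation rho."""
    base = E_H + 1.0 / rho
    d = overlap(rho)
    e1 = base + (coulomb_J(rho) + exchange_K(rho)) / (1.0 + d)
    e2 = base + (coulomb_J(rho) - exchange_K(rho)) / (1.0 - d)
    return e1, e2


# Scan rho from 0.5 to 8.0 and pick the separation that minimizes E1.
grid = [0.5 + 0.01 * k for k in range(751)]
rho_min = min(grid, key=lambda rho: energies(rho)[0])
e1_min, e2_at_min = energies(rho_min)
```

On this grid $E_1$ dips below $E_H$ at an intermediate separation, whereas $E_2$ stays above $E_H$, in line with the statement that only the lower root accounts for a stable ion.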

Problem. Let $u_0$, $u_1$, $u_2$ be the three lowest energy states of the simple harmonic
oscillator, $H_0$ its Hamiltonian. The Hamiltonian for an oscillator in an electric field is
$H = H_0 + kx$, where k is a constant. Calculate by the variational method the lowest
energy of this system, using as trial functions (a) $u_0$, (b) $a_0u_0 + a_1u_1$, (c) $a_0u_0 + a_1u_1 + a_2u_2$.

Ans. (a) $\tfrac{1}{2}h\nu$, (b) $h\nu - \sqrt{k^2x_{01}^2 + \tfrac{1}{4}(h\nu)^2} \approx \tfrac{1}{2}h\nu - k^2x_{01}^2/h\nu$, (c) $\tfrac{1}{2}h\nu - h\nu k^2x_{01}^2/[(h\nu)^2 - k^2x_{12}^2]$ (approximately). Here $x_{ij}$ is defined as $\int u_ixu_j\,d\tau$, as usual.

11.22. Perturbation Theory.— The following problem is frequently met 
in quantum mechanics. We know the energy states of a given system, say 
an atom, and also its eigenfunctions. A small perturbation, such as an 


22 Pauling and Wilson, loc. cit. 





electric or magnetic field, is now imposed; this changes, presumably by
slight amounts, both energies and state functions. Mathematically, the
situation is described in this way. We know the solutions and eigenvalues
of

$$H^0\psi_i = E_i^0\psi_i \tag{11-101}$$

where $H^0$ is the "unperturbed" Hamiltonian. We wish to find solutions
and eigenvalues of

$$H\phi_i = E_i\phi_i, \qquad H = H^0 + H' \tag{11-102}$$

$H'$ being considered as a "small" addition to $H^0$. (By a small operator
we mean one whose matrix elements, formed with the functions $\psi_i$, are all
small compared with the diagonal elements of $H^0$.)

To solve the problem we use the method of linear variation functions,
using as our trial function

$$\phi = \sum_\lambda a_\lambda\psi_\lambda \tag{11-103}$$

If we allow an infinite number of terms in this summation and choose
the coefficients properly, we expect $\phi$ to be the correct solution of (102), for
the $\psi_\lambda$ of (101) form a complete set. But since the $\psi_\lambda$ are orthonormal, the
energies are given as the roots of (97) with every $\Delta_{ij}$ replaced by a
Kronecker $\delta_{ij}$, so that E appears only in the principal diagonal. Moreover,

$$\mathcal{H}_{ij} = H_{ij} = (H^0)_{ij} + H'_{ij} = E_i^0\delta_{ij} + H'_{ij}, \qquad H'_{ij} = \int\psi_i^*H'\psi_j\,d\tau$$

Hence the determinant reads

$$\begin{vmatrix} H'_{11} - (E - E_1^0) & H'_{12} & H'_{13} & H'_{14} & \cdots \\ H'_{21} & H'_{22} - (E - E_2^0) & H'_{23} & H'_{24} & \cdots \\ H'_{31} & H'_{32} & H'_{33} - (E - E_3^0) & H'_{34} & \cdots \\ H'_{41} & H'_{42} & H'_{43} & H'_{44} - (E - E_4^0) & \cdots \\ \vdots & & & & \end{vmatrix} = 0 \tag{11-104}$$

If all its roots could actually be found they would indeed be the exact
energies of our problem. But in the case we are visualizing certain simplifying
approximations are in order. Suppose we are interested in the
energy $E_1$, that is, the energy to which $E_1^0$ is changed by the perturbation.
($E_1$ need not be the lowest energy of our system, for the states may be
labeled in an arbitrary order.) If $E_1^0$ is a non-degenerate level, then $E_1$






will lie much closer to $E_1^0$ than to any other unperturbed $E_i^0$. This suggests
the following approximations:

a. Put $E = E_1^0$ in all diagonal elements except the first.

b. Since every difference $E_i^0 - E_1^0$ for $i \ne 1$ is large compared to $H'_{ii}$,
the latter may be omitted in all diagonal elements except the first.

c. Neglect all non-diagonal elements except those in the first row and
the first column, since they affect $E_1$ only in a secondary way.

When this is done, the determinant reads (we now write $\Delta E_1$ for the
perturbation $E - E_1^0$ we are seeking)

$$\begin{vmatrix} H'_{11} - \Delta E_1 & H'_{12} & H'_{13} & H'_{14} & \cdots \\ H'_{21} & E_2^0 - E_1^0 & 0 & 0 & \cdots \\ H'_{31} & 0 & E_3^0 - E_1^0 & 0 & \cdots \\ H'_{41} & 0 & 0 & E_4^0 - E_1^0 & \cdots \\ \vdots & & & & \end{vmatrix} = 0 \tag{11-105}$$

It may be evaluated by the usual process of adding multiples of rows or
columns. In this instance, multiply the second row by $H'_{12}/(E_2^0 - E_1^0)$,
and then subtract it from the first. The element $H'_{12}$ will then disappear
from the first row, but the first element is converted into

$$H'_{11} - \Delta E_1 - \frac{H'_{12}H'_{21}}{E_2^0 - E_1^0}$$

Next, multiply the third row by $H'_{13}/(E_3^0 - E_1^0)$ and subtract it from
the first. The result will be disappearance of $H'_{13}$ and addition of
$-H'_{13}H'_{31}/(E_3^0 - E_1^0)$ to the first element. This process is continued until
all non-diagonal elements of the first row have disappeared. We now have

$$\left(H'_{11} - \Delta E_1 - \sum_{\lambda \ne 1}\frac{H'_{1\lambda}H'_{\lambda1}}{E_\lambda^0 - E_1^0}\right)(E_2^0 - E_1^0)(E_3^0 - E_1^0)\cdots = 0$$

If $E_1^0$ is non-degenerate, as we are supposing, none of the parentheses
except the first can be zero. We therefore conclude

$$\Delta E_1 = H'_{11} - \sum_{\lambda \ne 1}\frac{H'_{1\lambda}H'_{\lambda1}}{E_\lambda^0 - E_1^0} \tag{11-106}$$

and this is the Rayleigh-Schrödinger perturbation formula. The quantity
$H'_{11}$ is often called the first-order perturbation, the sum on the right is called
the second-order perturbation. By retaining more elements in (104) third
and higher orders may be computed, but these are rarely used. When the
approximation (106) is not sufficient it is generally preferable to return to
the variation scheme, or to find a more successful way of evaluating the
determinant (104).





Formula (106) may, of course, be used to calculate the perturbation in
any energy level which is non-degenerate; to show this fact it may be
written in the form

$$\Delta E_k = H'_{kk} - {\sum_\lambda}'\frac{|H'_{k\lambda}|^2}{E_\lambda^0 - E_k^0} \tag{11-107}$$

where we have also used the Hermitian property of $H'_{k\lambda}$. The prime on the
summation symbol indicates that the term in which $\lambda = k$ should be
omitted.

Next, let us find the coefficients $a_\lambda$ in (103). They are obtained from
(96) which now reads

$$\sum_\mu a_\mu(E_k^0\delta_{k\mu} + H'_{k\mu} - E\delta_{k\mu}) = 0, \qquad k = 1, 2, \cdots$$

In accordance with the approximations which led to eq. (106) we put
$E = E_1^0$ and neglect every $H'_{k\mu}$ unless one of the subscripts is 1. We then
find

$$a_1H'_{21} + a_2(E_2^0 - E_1^0) = 0 \quad \text{if } k = 2$$

$$a_1H'_{31} + a_3(E_3^0 - E_1^0) = 0 \quad \text{if } k = 3, \text{ etc.}$$

Hence

$$a_\lambda = \frac{H'_{\lambda1}}{E_1^0 - E_\lambda^0}\,a_1, \qquad \lambda \ne 1$$

or in general, if we are interested not in $E_1$ but in $E_k$,

$$a_\lambda = \frac{H'_{\lambda k}}{E_k^0 - E_\lambda^0}\,a_k, \qquad \lambda \ne k \tag{11-108}$$

The coefficient $a_k$ must be chosen so that $\phi$ is normalized. Since all other
$a_\lambda$ are small, its value is very nearly unity and may be taken as such.

Formulas (107) and (108) have been derived by assuming that the level,
k, whose perturbation is being calculated, was non-degenerate. For
degenerate levels both formulas obviously fail, for they contain terms with
vanishing denominators (several $E_\lambda^0$ being equal to $E_k^0$). To deal with the
case of degeneracy we have to return to the fundamental determinant
(104). If the functions $u_1, u_2, \cdots, u_n$ all belong to the same energy $E_1^0$ (we
then say that the level $E_1^0$ has an n-fold degeneracy), these functions are
equally concerned in the perturbation, and if we formerly retained all
matrix elements of the form $H'_{1\lambda}$, we must now retain $H'_{2\lambda}, H'_{3\lambda}, \cdots, H'_{n\lambda}$
also. But for most purposes sufficient accuracy results if we neglect all
elements connecting a state of the degenerate group with all states not





belonging to that group. Eq. (104) reduces in this case to 


Mil - 

AE H[ 2 

HU •• 

• M[ n 


Mil 

H 2 2 ~~ AE 

HU •• 

■ H' 2n 


Hi, 

HU HU — AE • • 

• HU 

= 0 (11-109) 

HU 

HU 

HU •• 

• HU - AE 



the n roots of this equation (of which some may coincide) are the energies 
into which E? will “ split ” as the result of the perturbation. They can- 
not, of course, be represented by a general formula. 

These energies are said to represent the first-order perturbation. If
greater accuracy is desired the work may be continued in this way. By
substituting the first-order energies into eqs. (96) and neglecting all states
not belonging to the degenerate group, $n$ sets of coefficients $a_1, a_2, \cdots, a_n$
are found, each set belonging to a single first-order energy. This yields
$n$ functions

$$v_i = \sum_{\lambda=1}^{n} a_{i\lambda} u_\lambda$$

If now we construct matrix elements with the $v$-functions,
$H'_{ij} = \int v_i^* H' v_j \, d\tau$, these will be diagonal; for solving (109) is the well-known
procedure for diagonalizing the matrix $H'$. (See Chapter 10.) Hence,
when the $v$-functions are chosen to represent the $n$ degenerate states, the
second order perturbation can be computed by formula (107), from which
the terms with vanishing denominator are now absent because every $H'_{ij}$
corresponding to them is zero.
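As a minimal illustration of eq. (109), the following sketch solves the secular determinant for the simplest case, a two-fold degenerate level ($n = 2$), where the two roots can be written in closed form. The function name and the numerical matrix elements are hypothetical and serve only to show the splitting; they are not taken from the text.

```python
import math

def first_order_splitting(h11, h22, h12):
    """Roots dE of the 2x2 secular determinant
        | h11 - dE     h12    |
        | conj(h12)  h22 - dE | = 0
    for real diagonal elements and a (possibly complex) off-diagonal element."""
    mean = 0.5 * (h11 + h22)
    half_gap = 0.5 * (h11 - h22)
    root = math.sqrt(half_gap ** 2 + abs(h12) ** 2)
    return mean - root, mean + root

# Hypothetical matrix elements of H' inside a doubly degenerate level
# (arbitrary energy units, chosen only for illustration).
lower, upper = first_order_splitting(0.0, 0.0, 0.3)
print(lower, upper)  # two first-order energies, symmetric about zero
```

When the off-diagonal element vanishes the roots reduce to the diagonal elements themselves, which is the situation met in the Zeeman example of sec. 11.24.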

11.23. Example: Non-Degenerate Case. The Stark Effect. — Let 

$H^0$ represent the Hamiltonian operator for any one-electron system, and
let $\psi_1, \psi_2, \cdots$ be its eigenfunctions. When a uniform electric field along X is
applied, the term $H' = -eFx$ is added to $H^0$, $e$ being the electronic charge
and $F$ the field strength. The normal state of the system is non-degenerate,
hence formula (107) may be used. Denoting the normal state by the
subscript zero, we find

$$\Delta E_0 = -eF x_{00} - e^2F^2 \sum_\lambda{}' \frac{|x_{0\lambda}|^2}{E_\lambda^0 - E_0^0} \tag{11-110}$$

Here $x_{0\lambda} = \int \psi_0^* x \psi_\lambda \, d\tau$. The first term on the right is usually zero because
$|\psi_0|^2$ is an even function of $x$; thus the “first-order Stark effect” is absent.

In classical physics, the increment in energy of an atom due to a static 
electric field is expressed in terms of the polarizability a in the form 

$$\Delta E = -\tfrac{1}{2}\alpha F^2$$





On comparing this with (110) we find for the polarizability of the normal 
state of our system 


$$\alpha = 2e^2 \sum_\lambda{}' \frac{|x_{0\lambda}|^2}{E_\lambda^0 - E_0^0}$$


For an oscillator, this takes a particularly simple form, since all $x_{0\lambda}$
vanish with the exception of $x_{01} = \sqrt{1/2\beta}$ (cf. Chapter 3, eqs. 92 and 93).
Also, $E_\lambda^0 = (\lambda + \tfrac{1}{2})h\nu$. Thus

$$\alpha = \frac{e^2}{\beta h\nu}$$

Comparison with the problem of sec. 21 shows that second-order perturbation
theory gives in this instance the same result as the variational
method with the trial function $a_0\psi_0 + a_1\psi_1$. In general, however, the
use of a simple variation function yields a much poorer result for the
polarizability than the method of sec. 22.
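The oscillator result can be checked by carrying out the sum over excited states numerically. The sketch below assumes the usual oscillator matrix elements, $x_{n,n+1} = \sqrt{(n+1)/2\beta}$ and zero unless the indices differ by one; this general form is an assumption that reduces to the quoted $x_{01}$ for $n = 0$. The helper names and the numerical values of $e$, $\beta$ and $h\nu$ are illustrative only.

```python
import math

def x_element(n, lam, beta):
    """Oscillator matrix element x_{n,lam}; assumed nonzero only for lam = n +/- 1."""
    if lam == n + 1:
        return math.sqrt((n + 1) / (2.0 * beta))
    if lam == n - 1:
        return math.sqrt(n / (2.0 * beta))
    return 0.0

def polarizability(e, beta, h_nu, n_states=20):
    """alpha = 2 e^2 * sum' |x_0lam|^2 / (E_lam - E_0), with E_lam = (lam + 1/2) h nu."""
    total = 0.0
    for lam in range(1, n_states):
        total += x_element(0, lam, beta) ** 2 / (lam * h_nu)
    return 2.0 * e ** 2 * total

e, beta, h_nu = 1.0, 2.0, 0.5           # arbitrary illustrative units
alpha = polarizability(e, beta, h_nu)
closed_form = e ** 2 / (beta * h_nu)     # the simple form quoted for the oscillator
print(alpha, closed_form)                # the two values agree
```

Only the term $\lambda = 1$ contributes, so extending the sum to more states leaves the result unchanged.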

11.24. Example: Degenerate Case. The Normal Zeeman Effect. — 
The energy states of the hydrogen atom were found to be

$$R_{n,l}(r)\, Y_l(\theta,\varphi)$$

To a given $l$, there belong $2l + 1$ spherical harmonics of the form

$$Y_l = \sum_{m=-l}^{l} c_m P_l^m(\cos\theta)\, e^{im\varphi}, \qquad (P_l^{-m} \equiv P_l^{m})$$

and each such combination, with its own set of coefficients $c_m$, forms a proper
eigenfunction when multiplied by $R_{n,l}$. The energy does not depend on $m$;
the state under consideration has therefore a $(2l + 1)$-fold degeneracy.

Let us choose the $2l + 1$ functions in the simplest possible way, namely
by letting each $Y_l$ contain only one term, as follows:

$$R_{n,l} \cdot c_{-l} P_l^{l} e^{-il\varphi}, \; \cdots, \; R_{n,l} \cdot c_{l} P_l^{l} e^{il\varphi}$$

and label them $\psi_1, \cdots, \psi_{2l+1}$ in that order.

The Zeeman effect is the splitting of the energy levels of an atom in a
magnetic field. When a uniform magnetic field along the Z-axis and of
strength $F$ is applied to the hydrogen atom, its unperturbed Hamiltonian
takes on the extra term$^{23}$

$$H' = -i\,\frac{e\hbar}{2Mc}\,F\,\frac{\partial}{\partial\varphi} \equiv -iA\,\frac{\partial}{\partial\varphi}$$

Each matrix element $H'_{ij} = \int \psi_i^* H' \psi_j \, d\tau$ contains the factor $\int R_{n,l}^2 r^2 \, dr$


23 See Van Vleck, J. H., “The Theory of Electric and Magnetic Susceptibilities,”
Oxford, 1932. We write here $M$ for the electron mass to avoid conflict with the summation
index $m$ (magnetic quantum number). Note that $H'$ is the quantum representation
of $(e/2Mc)\,\mathbf{F}\cdot\mathbf{L}$, where $\mathbf{L}$ is the angular momentum vector of sec. 11.3.





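which, by virtue of the normalization of the radial functions, is unity.
$H'_{12}$ is proportional to

$$\int_0^{2\pi} e^{il\varphi}\, \frac{\partial}{\partial\varphi}\, e^{-i(l-1)\varphi} \, d\varphi,$$

and this vanishes. In the same way all other non-diagonal matrix elements
are seen to be zero. The diagonal element $H'_{11}$ is

$$-iA \int \psi_1^* \frac{\partial\psi_1}{\partial\varphi} \, d\tau = -Al$$

and the others are similarly constructed.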

When these elements are substituted into (109) we have 


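$$
\begin{vmatrix}
-lA - \Delta E & 0 & 0 & \cdots & 0 \\
0 & -(l-1)A - \Delta E & 0 & \cdots & 0 \\
0 & 0 & -(l-2)A - \Delta E & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & lA - \Delta E
\end{vmatrix} = 0
$$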


The determinant is already diagonal; our choice of functions was a
fortunate one. The perturbed energies are clearly

$$\Delta E = mA = \frac{\hbar eF}{2Mc}\, m, \qquad m = -l, -l+1, \cdots, 0, 1, \cdots, l$$

Classically, an electron in a magnetic field $F$ performs a uniform precession
of angular frequency $\omega_L = eF/2Mc$, known as the Larmor frequency.
Thus we see that $\Delta E = m\hbar\omega_L$.
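The equally spaced pattern $\Delta E = mA$ is easily tabulated. In the sketch below the constant $e\hbar/2Mc$ is inserted with its rounded value in Gaussian units (erg per gauss); the function name and the chosen field strength are assumptions made for illustration.

```python
BOHR_MAGNETON = 9.274e-21  # e*hbar/(2*M*c) in erg per gauss (rounded, Gaussian units)

def zeeman_shifts(l, field_gauss):
    """First-order energies dE = m*A, m = -l, ..., l, with A = (e*hbar/2Mc)*F."""
    A = BOHR_MAGNETON * field_gauss
    return [m * A for m in range(-l, l + 1)]

shifts = zeeman_shifts(2, 1.0e4)   # l = 2 in a hypothetical field of 10^4 gauss
print(len(shifts))                 # 2l + 1 = 5 levels
print(shifts[-1] - shifts[-2])     # spacing A between neighboring levels, in erg
```

Each level of given $l$ thus splits into $2l + 1$ components whose spacing is proportional to the field.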


Problem. Calculate the Stark effect of the rigid rotator (cf. sec. 11.12), for the
state $l = 3$, adopting the same choice for the spherical harmonics as above. Here
$H' = -eaF\cos\theta$, provided the electric field $F$ is along Z. The determinant will not be
diagonal. To calculate the matrix elements, use formulas (3-48 and 53). Include in
your calculation successively more states: $l = 2, 3, 4$; $l = 1, 2, 3, 4, 5$.

TIME-DEPENDENT STATES. SCHRÖDINGER’S TIME EQUATION

11.25. General Considerations. — In all preceding considerations we 
have assumed that the states of the systems in question were stationary 
ones, that the time coordinate could be disregarded in describing them. In 
generalizing the theory so as to make it applicable to states which change in 
time it is well to look back and see why a time-free description was possible 
thus far. 

It is important to note that the time, $t$, in classical mechanics is canonically
conjugate to the energy, $E$, in the same sense that $x$ is conjugate to
$p_x$. Let us then for the moment consider the operator $P_x = -i\hbar(\partial/\partial x)$.
Its eigenstates were seen to be (cf. eq. 9) $\psi_p \sim e^{(i/\hbar)p_x x}$. What do they tell
us about the distribution of the system in $x$? The answer is, it is uniform.





Whatever is true at the point $x_1$, is also true at the point $x_2$. This is the
meaning of the uncertainty principle applied to the case at hand: if the
momentum is known with certainty, the state function is entirely non-committal
with regard to $x$. If in the calculation of the mean value of an
operator $Q$,

$$\bar{Q} = \int \psi_p^* Q \psi_p \, d\tau$$

$Q$ did not depend on $x$, we could have afforded to neglect the factor $e^{(i/\hbar)p_x x}$ of
$\psi_p$ altogether. It had to be included, however, because most operators of
interest do depend on $x$.

But this trivial situation existed with regard to the time coordinate in
all the Schrödinger problems considered heretofore. The states were those
in which the energy was known with certainty, and for this reason the state
functions were completely indiscriminate in respect to $t$. What was true
at $t_1$ was also true at $t_2$. Moreover, the other operators used were independent
of $t$. This condition will always be present as long as we are dealing
with closed systems, for the energy will then be constant in time.

When the system is an open one, the present method must clearly fail.
But the last remarks contain the hint that we should, perhaps, associate
with $E$ the operator $-i\hbar(\partial/\partial t)$. This would lead to the eigenvalue
equation

$$-i\hbar\,\frac{\partial\psi}{\partial t} = E\psi$$

which is certainly too simple because the energy depends on other things
beside the time. The example above gives us no definite lead at this point
because $p_x$ does possess the single dependence on $x$. There is, however,
only one reasonable way to include these other variables, namely, to put
them into $E$, which thereby ceases to be an eigenvalue: $E$ must be replaced
by the Hamiltonian operator $H$. We then arrive at Schrödinger’s time
equation

$$-i\hbar\,\frac{\partial\psi}{\partial t} = H\psi \tag{11-111}$$

$H$ is to be constructed as before by replacement of every Cartesian
momentum $p_i$ by $-i\hbar(\partial/\partial q_i)$ and the dependence on $t$ is to be introduced
explicitly.

It is immaterial, of course, whether we choose eq. (111) or its complex
conjugate equation. The latter choice has certain advantages and will
here be made. Furthermore, we shall use the symbol $u$ (more or less
generally) for time-dependent state functions and thus record Schrödinger’s





time equation in the form

$$Hu = i\hbar\,\frac{\partial u}{\partial t} \tag{11-112}$$

This equation, being of the first order in t, permits prediction of the state u 
at any future (or past) time when u is known as a function of the coordi- 
nates at present. Although it is closely related to the preceding develop- 
ments, eq. (112) is a new postulate not derivable from those already given. 

The present theory must be valid also in the special case when $H$ does
not contain $t$. When that is true eq. (112) is separable. On writing
$u = \psi(q_1 \cdots q_n) \cdot f(t)$ it becomes equivalent to the equation

$$\frac{H\psi}{\psi} = i\hbar\,\frac{df/dt}{f}$$

each side of which must represent a constant. But in view of the form of
the left-hand side, that constant must be one of the eigenvalues of the
operator $H$, say $E_\lambda$, so that

$$\frac{df}{dt} = -\frac{iE_\lambda}{\hbar}\, f$$

Hence

$$f_\lambda = c\, e^{-(iE_\lambda/\hbar)t}$$


The general solution of eq. (112) for the special case in which H is inde- 
pendent of the time is 

$$u = \sum_\lambda c_\lambda \psi_\lambda\, e^{-(iE_\lambda/\hbar)t} \tag{11-113}$$


We have formerly said that any state function, such as $u$, could be
expanded in the orthonormal system of functions $\psi_\lambda$. This expansion was
written as

$$u = \sum_\lambda a_\lambda \psi_\lambda$$

We now see that this is indeed true even when the analysis is made on
the basis of eq. (112), but the coefficients $a_\lambda$ are always functions of the
time: $a_\lambda = c_\lambda e^{-(iE_\lambda/\hbar)t}$. The mean value of $E$, computed for the state
(113), is

$$\bar{E} = \sum_\lambda |a_\lambda|^2 E_\lambda = \sum_\lambda |c_\lambda|^2 E_\lambda$$

It is independent of $t$. But the probability of finding the system at the
point $q_1 \cdots q_n$ of configuration space,

$$u^*u = \sum_{\lambda\mu} c_\lambda^* c_\mu \psi_\lambda^* \psi_\mu\, e^{(i/\hbar)(E_\lambda - E_\mu)t},$$

is a superposition of oscillating functions of the time. The only way for this





time dependence to be obliterated would be to have $c_\lambda = \delta_{\lambda j}$ in (113), in
which case

$$u^*u = \psi_j^* \psi_j$$

Thus, whenever a state is formed by superposition of energy eigenstates,
the mean energy of the system remains constant, but the configuration of
the system changes in time. The reader should note, of course, that the
solution of the Schrödinger equation (12) when multiplied by $e^{-(iE_\lambda/\hbar)t}$ is
also a solution of (112), but that the solution of (112) does not in general
satisfy (12).
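The contrast between the constant mean energy and the oscillating probability density may be made concrete with a superposition of two eigenstates. The sketch below evaluates (113) at a single fixed point of configuration space; the values assigned to $\psi_1$ and $\psi_2$ at that point, the energies, the units ($\hbar = 1$) and the function names are all hypothetical.

```python
import cmath
import math

HBAR = 1.0  # natural units, assumed for illustration

def state_value(c, psi, energies, t):
    """u = sum over lambda of c_lambda * psi_lambda * exp(-i E_lambda t / hbar) at one point."""
    return sum(cl * pl * cmath.exp(-1j * El * t / HBAR)
               for cl, pl, El in zip(c, psi, energies))

def mean_energy(c, energies):
    """Mean energy, the sum of |c_lambda|^2 * E_lambda; it contains no time."""
    return sum(abs(cl) ** 2 * El for cl, El in zip(c, energies))

c = [math.sqrt(0.5), math.sqrt(0.5)]  # normalized coefficients
psi = [0.8, 0.6]                      # hypothetical values of psi_1, psi_2 at the chosen point
energies = [1.0, 3.0]

period = 2.0 * math.pi * HBAR / (energies[1] - energies[0])
times = [0.0, 0.5 * period, period]
densities = [abs(state_value(c, psi, energies, t)) ** 2 for t in times]
for t, rho in zip(times, densities):
    print(t, rho, mean_energy(c, energies))
```

The density returns to its initial value after the time $2\pi\hbar/(E_2 - E_1)$, while the mean energy printed in the last column does not change.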

Problem. Let the time-dependent Hamiltonian be $H = H_0 + V(t)$, where $H_0$
acts only on space coordinates and has eigenfunctions $\psi_\lambda$, eigenvalues $E_\lambda$. Show that

$$u = \sum_\lambda c_\lambda \psi_\lambda\, e^{-(i/\hbar)\left(E_\lambda t + \int V\,dt\right)}$$

11.26. The Free Particle; Wave Packets. — The eigenfunctions of the 
energy of a free mass point (cf. sec. 11.9) moving in one dimension without
restriction are $\psi_k = e^{ikx}$, its energies $E_k = \frac{\hbar^2}{2m}k^2$, and there is no quantization.
The general solution of eq. (112) for the free particle is therefore,

$$u = \int_{-\infty}^{\infty} c(k)\, e^{i[kx - (\hbar/2m)k^2 t]} \, dk \tag{11-114}$$

a function constructed after the manner of (113) but with an integral
instead of a sum. An integral very similar to this has been already
encountered in the mathematical formulation of waves (cf. eq. 7-38) and
of diffusion phenomena (eq. 7-53). It is interesting to inquire what form
$u$ will have at some time $t$ if at $t = 0$ it is given by $u = u_0(x)$. The
coefficient $c(k)$ may be determined by Fourier analysis. We have

$$u_0 = \int_{-\infty}^{\infty} c(k)\, e^{ikx} \, dk$$

whence by eq. 8-13

$$c(k) = \frac{1}{2\pi} \int_{-\infty}^{\infty} u_0(\xi)\, e^{-ik\xi} \, d\xi$$

Eq. (114) therefore reads

$$u(x,t) = \frac{1}{2\pi} \iint u_0(\xi)\, e^{i[k(x-\xi) - (\hbar/2m)k^2 t]} \, d\xi \, dk$$

In this instance, the integration over $k$ cannot be performed (as it
could in the diffusion problem, sec. 7.14). To proceed further it is necessary
to introduce the function $u_0$ explicitly.





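Assume that $u_0 = e^{-x^2/2a^2}$. Then, with the use of the formula

$$\int_{-\infty}^{\infty} e^{-p x^2 + q x} \, dx = \sqrt{\frac{\pi}{p}}\; e^{q^2/4p} \tag{11-115}$$

we find

$$c(k) = \frac{a}{\sqrt{2\pi}}\; e^{-a^2k^2/2}$$

Hence

$$u = \frac{a}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-k^2[a^2/2 + i(\hbar/2m)t] + ikx} \, dk = \left[1 + i\,\frac{\hbar t}{ma^2}\right]^{-1/2} \exp\left\{-\frac{x^2}{2a^2\left[1 + i(\hbar t/ma^2)\right]}\right\} \tag{11-116}$$

again with the aid of (115).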

Eq. (114) represents a superposition of waves of wave length $2\pi/k$ and
frequency $\nu = (\hbar/4\pi m)k^2$. The form of $u_0$ here chosen describes a concentration
of waves about the origin, a phenomenon called a “wave packet.”
Such a wave packet does not retain its spatial distribution; eq. (116) is
characteristic of the manner in which it diffuses.

From the point of view of quantum mechanics, $|u_0|^2$ is the probability
density of the particle at $t = 0$. It represents a Gauss error function of
“width” $a$. At time $t$,

$$u^*u = \left[1 + \left(\frac{\hbar t}{ma^2}\right)^2\right]^{-1/2} \exp\left\{-\frac{x^2}{a^2 + \dfrac{\hbar^2}{m^2a^2}\,t^2}\right\}$$

The probability density is still a Gauss function, but of smaller maximum
and of width $[a^2 + (\hbar^2/m^2a^2)t^2]^{1/2}$.
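The rate at which the packet spreads depends strongly on the mass and on the initial width. The following sketch evaluates the width and the maximum of $u^*u$ from the expressions just given; it uses SI values of $\hbar$ and the electron mass, and the initial width and the sample times are hypothetical choices, not those of the problems below.

```python
import math

HBAR = 1.054571817e-34         # J s
M_ELECTRON = 9.1093837015e-31  # kg

def packet_width(a, m, t):
    """Width [a^2 + (hbar^2 / m^2 a^2) t^2]^(1/2) of the probability packet at time t."""
    return math.sqrt(a ** 2 + (HBAR / (m * a)) ** 2 * t ** 2)

def peak_density(a, m, t):
    """Maximum of u*u at time t, i.e. [1 + (hbar t / m a^2)^2]^(-1/2)."""
    return (1.0 + (HBAR * t / (m * a ** 2)) ** 2) ** -0.5

a = 1.0e-10  # hypothetical initial width in metres
for t in (0.0, 1.0e-16, 1.0e-15):
    print(t, packet_width(a, M_ELECTRON, t), peak_density(a, M_ELECTRON, t))
```

The product of the maximum and the width stays equal to $a$, which expresses the conservation of the total probability in one dimension.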

Problem a. Compute how long it would take an electron, localized within
$a = 10^{-10}$ cm., to diffuse through twice that distance.

b. How long would it take an object weighing one gram, localized within 1 cm., to
diffuse through twice that distance?

c. Show that if $u_0 = c\,e^{iKx}$, where $K$ is a constant, the wave will be of the form

$$u = c\, e^{i[Kx - (\hbar/2m)K^2 t]}$$

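If our particle is free to move in three dimensions, then as shown in
sec. 11.9,

$$\psi_{\mathbf{k}} = e^{i\mathbf{k}\cdot\mathbf{r}}, \quad \text{and} \quad E_k = \frac{\hbar^2}{2m}k^2$$

Hence (114) has the form

$$u = \int c(\mathbf{k})\, e^{i[\mathbf{k}\cdot\mathbf{r} - (\hbar/2m)k^2 t]} \, d\mathbf{k} \tag{11-117}$$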





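Again, if $u = u_0(x,y,z)$ at $t = 0$,

$$u_0 = \int c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}} \, d\mathbf{k}$$

whence by 3-dimensional Fourier analysis

$$c(\mathbf{k}) = \frac{1}{(2\pi)^3} \int u_0(\xi,\eta,\zeta)\, e^{-i\mathbf{k}\cdot\boldsymbol{\rho}} \, d\boldsymbol{\rho}$$

the vector $\boldsymbol{\rho}$ having components $\xi$, $\eta$, $\zeta$.

Assume now, in analogy with the one-dimensional case, that

$$u_0 = e^{-r^2/2a^2}$$

At $t = 0$ the wave packet is a spherical concentration of waves centered
about the origin; the probability packet has a similar shape and a width $a$.
On inserting into the relation for $c$ we have

$$c(\mathbf{k}) = \frac{1}{(2\pi)^3} \int e^{-(\xi^2/2a^2) - ik_1\xi} \, d\xi \int e^{-(\eta^2/2a^2) - ik_2\eta} \, d\eta \int e^{-(\zeta^2/2a^2) - ik_3\zeta} \, d\zeta = \frac{a^3}{(2\pi)^{3/2}}\, e^{-a^2k^2/2}$$

This gives

$$u = (2\pi)^{-3/2} a^3 \int e^{-[(a^2/2) + i(\hbar/2m)t]k_1^2 + ik_1 x} \, dk_1$$

times two similar integrals with $k_1$ replaced by $k_2$ and $k_3$, $x$ by $y$ and $z$. Hence

$$u = \left[1 + i\,\frac{\hbar t}{ma^2}\right]^{-3/2} \exp\left\{-\frac{r^2}{2a^2\left[1 + i(\hbar t/ma^2)\right]}\right\}$$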



The interpretation of this result is not different from that of (116). 

Before leaving the subject of “particle waves,” we should remark that
every component wave of the packet (117), being of the form $e^{i(\mathbf{k}\cdot\mathbf{r} - 2\pi\nu t)}$,
travels in a positive direction along $\mathbf{k}$. Had we chosen the sign as in eq.
(111) and not as in (112), the waves would have been of the form
$e^{i(\mathbf{k}\cdot\mathbf{r} + 2\pi\nu t)}$, which implies that they travel along $-\mathbf{k}$. Since $\mathbf{k}\hbar$ represents
the momentum of the particle, the latter choice is an unsuitable one. We
also note that the wave length $\lambda = 2\pi/k = 2\pi\hbar/mv = h/mv$ conforms to
the De Broglie formula. The phase velocity of the waves is $\nu\lambda = \hbar k/2m =
mv/2m = v/2$, but their group velocity,$^{24}$ defined as $2\pi(d\nu/dk) = v$, is
equal to the classical speed of the particle.

24 For a discussion of group velocity, see Sommerfeld, A., “Wellenmechanischer
Ergänzungsband,” Friedr. Vieweg & Sohn, Braunschweig, 1929, p. 46.
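The statement about phase and group velocity can be verified numerically from the dispersion law $2\pi\nu = (\hbar/2m)k^2$ of eq. (114). In the sketch below the group velocity is estimated by a central difference; the function names, the SI constants and the sample speed are assumptions for illustration.

```python
HBAR = 1.054571817e-34         # J s
M_ELECTRON = 9.1093837015e-31  # kg

def omega(k, m):
    """Angular frequency 2*pi*nu = (hbar / 2m) k^2 of a free-particle wave."""
    return HBAR * k ** 2 / (2.0 * m)

def phase_velocity(k, m):
    """nu * lambda = omega / k."""
    return omega(k, m) / k

def group_velocity(k, m, rel_step=1.0e-4):
    """Central-difference estimate of d(omega)/dk = 2*pi*(d nu / dk)."""
    dk = rel_step * k
    return (omega(k + dk, m) - omega(k - dk, m)) / (2.0 * dk)

v = 1.0e6                        # hypothetical classical speed, m/s
k = M_ELECTRON * v / HBAR        # wave number from k*hbar = m*v
phase_ratio = phase_velocity(k, M_ELECTRON) / v
group_ratio = group_velocity(k, M_ELECTRON) / v
print(phase_ratio, group_ratio)  # close to one half and to one, respectively
```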





11.27. Equation of Continuity, Current. — If the state function changes 
in time in accordance with the Schrödinger equation

$$Hu = i\hbar\dot{u} \tag{11-118}$$

will it remain normalized? If it does not, there occurs a destruction or
creation of probability; while initially there was certainty of finding the
particle somewhere in space, there might later be uncertainty, a situation
which would clearly be physically untenable. Permanence of normalization,
however, follows immediately from (118). For

$$\frac{\partial}{\partial t}\int u^*u \, d\tau = \int [\dot{u}^*u + u^*\dot{u}] \, d\tau = \frac{i}{\hbar}\int [uH^*u^* - u^*Hu] \, d\tau$$

because of (118), and the last expression is zero on account of the Hermitian
character of $H$.

Having shown that $u^*u$ is conserved we can define a probability current
by subjecting $u^*u$, which we will call $\rho$ for the moment, to the equation of
continuity

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{I} = 0 \tag{11-119}$$

Whatever $\mathbf{I}$ turns out to be must be regarded as the current corresponding
to the “flow” of the quantity $u^*u$. We shall limit our consideration
to the case of a single particle so that

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(x,y,z)$$

although generalization to many-dimensioned configuration space is easy.
Again because of (118)

$$\frac{\partial\rho}{\partial t} = \dot{u}^*u + u^*\dot{u} = \frac{i}{\hbar}(uH^*u^* - u^*Hu) = \frac{i\hbar}{2m}(u^*\nabla^2u - u\nabla^2u^*) = \nabla\cdot\left[\frac{i\hbar}{2m}(u^*\nabla u - u\nabla u^*)\right]$$

To satisfy (119) we must put$^{25}$

$$\mathbf{I} = -\frac{i\hbar}{2m}(u^*\nabla u - u\nabla u^*) \tag{11-120}$$


It is interesting to observe that a state u which has no complex depend- 
ence on a space variable has no current associated with it. Thus, in the 

25 This form of $\mathbf{I}$ is correct so long as the potential energy $V$ is of the scalar form
here used. When $H$ contains a vector potential, $\mathbf{A}$, the term $(e/c)\mathbf{A}$ must be added to
the expression for the current here given.





free particle problem, $\cos kx$ and $\sin kx$ represent stationary states, but
$e^{ikx}$ and $e^{-ikx}$ have currents.
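The last remark may be checked by evaluating the one-dimensional form of (120) numerically. In the sketch below the derivative is approximated by a central difference; the units ($\hbar = m = 1$), the wave number, the point of evaluation and the function names are arbitrary choices made for illustration.

```python
import cmath
import math

HBAR = 1.0  # illustrative units
MASS = 1.0

def current(u, x, h=1.0e-5):
    """One-dimensional form of eq. (120): I = -(i*hbar/2m)(u* du/dx - u du*/dx)."""
    du = (u(x + h) - u(x - h)) / (2.0 * h)  # central-difference derivative
    value = -1j * HBAR / (2.0 * MASS) * (u(x).conjugate() * du - u(x) * du.conjugate())
    return value.real

k = 2.0

def travelling(x):
    return cmath.exp(1j * k * x)

def standing(x):
    return complex(math.cos(k * x))

i_travelling = current(travelling, 0.3)
i_standing = current(standing, 0.3)
print(i_travelling, i_standing)  # close to hbar*k/m for the first, zero for the second
```

For $e^{ikx}$ the current equals $\hbar k/m$, the classical velocity multiplied by the (unit) density; replacing $k$ by $-k$ reverses its sign.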


Problem. Compute I for the various regions of the barrier problems considered 
in sec. 11.10. 

11.28. Application of Schrodinger’s Time Equation. Simple Radia- 
tion Theory. — The cases in which eq. (118) can be solved exactly are not 
numerous and not very interesting. When the time equation (118) is not 
separable, resort must be taken to approximation methods, the most useful 
of which will now be illustrated. 

Let an atom, whose normal Hamiltonian function, free from all perturbations,
is $H_0$, be suddenly subjected to a light wave which adds a
perturbing energy

$$V(x,t) = -eF_0 x \sin\omega t \tag{11-121}$$

to $H$. Physically, this means the light wave is monochromatic and has
frequency $\nu = \omega/2\pi$; its electric vector is along X and of amplitude $F_0$.$^{26}$
If $V$ did not contain $x$ and $\sin\omega t$ in product form, eq. (118) with
$H = H_0 + V$ would be separable: the fusion of $x$ and $t$ into $V$ spoils
separability.

In solving (118) we use the following initial condition: At $t = 0$, when
the atom was exposed to the perturbation $V$, the atom was certainly in an
eigenstate of the operator $H_0$, say in the state $\psi_1$ corresponding to the
energy $E_1$ which we shall take to be the lowest energy of the system. Or,
if we wish to include the trivial time dependence of the state, we take

$$u = \psi_1 e^{-(iE_1/\hbar)t} \tag{11-122}$$

The solution of

$$(H_0 + V)v = i\hbar\dot{v} \tag{11-123}$$

which we desire, is certainly available in the form

$$v = \sum_\lambda c_\lambda \psi_\lambda\, e^{-(iE_\lambda/\hbar)t} \tag{11-124}$$


26 Eq. (121) is a valid approximation for the purpose at hand. It neglects the
energy due to the magnetic vector of the light wave whose contribution is small compared
to (121) in the ratio $v/c$, where $v$ is the velocity of the charge composing the atom
and $c$ the velocity of light. For hydrogen, $v/c$ is 1/137. Furthermore, eq. (121) implies
that the wave length of the light is large compared with the size of the atom. Correctly,

$$V = -eF_0 x \sin 2\pi\left(\nu t - \frac{z}{\lambda}\right)$$

and we are omitting the term $z/\lambda$. The legitimacy of this will be clear from the
following analysis.




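provided we let the coefficients $c$ be functions of the time. This follows
immediately from the completeness of the $\psi_\lambda$ with respect to functions of
the space coordinates. When (124) is substituted into (123), there results

$$\sum_\lambda c_\lambda (H_0\psi_\lambda + V\psi_\lambda)\, e^{-(iE_\lambda/\hbar)t} = \sum_\lambda (c_\lambda E_\lambda \psi_\lambda + i\hbar\dot{c}_\lambda\psi_\lambda)\, e^{-(iE_\lambda/\hbar)t}$$

wherein each term $H_0\psi_\lambda$ on the left cancels $E_\lambda\psi_\lambda$ on the right. Let us now
multiply the remaining terms of the equation by $\psi_k^*$ and integrate over configuration
space, remembering the orthogonality of the $\psi_\lambda$. Then, after
simple rearrangement,

$$\dot{c}_k = -\frac{i}{\hbar}\sum_\lambda c_\lambda V_{k\lambda}\, e^{(i/\hbar)(E_k - E_\lambda)t}, \qquad k = 1, 2, 3, \cdots \tag{11-125}$$

where, as usual,

$$V_{k\lambda} = \int \psi_k^* V \psi_\lambda \, d\tau$$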

If the unperturbed atom has an infinite number of states, (125) represents
an infinite set of linear differential equations, which in general can
not be solved. But we now recall that at $t = 0$, $v = u$, which means that
all $c_k$ except $c_1$ were zero at that time. Thereafter $c_1$ decayed from 1 to
some smaller value, while all other $c$’s grew from 0 to various finite values.
We now limit our inquiry to times so small that $c_1$ is still sensibly unity, and
the other $c$’s are small compared with it, although $\dot{c}_1$ may be quite comparable
with the time derivatives of other $c$’s. This permits the approximation
of replacing every $c_\lambda$ on the right-hand side of (125) by its value at
$t = 0$, while retaining every $\dot{c}_k$. The equation then becomes

$$\dot{c}_k = -\frac{i}{\hbar}\, V_{k1}\, e^{(i/\hbar)(E_k - E_1)t}$$

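To simplify writing we introduce the abbreviation

$$\frac{E_k - E_1}{\hbar} = \omega_k$$

and observe that every $\omega_k > 0$, since, as we are assuming, $E_1$ is the lowest
energy state. In view of (121),

$$V_{k1} = -eF_0 x_{k1}\sin\omega t = \tfrac{1}{2}\, i e F_0 x_{k1}\,(e^{i\omega t} - e^{-i\omega t})$$

so that

$$\dot{c}_k = \frac{eF_0}{2\hbar}\, x_{k1}\left[e^{i(\omega_k+\omega)t} - e^{i(\omega_k-\omega)t}\right]$$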





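On integration,

$$c_k = \frac{ieF_0}{2\hbar}\, x_{k1}\left[\frac{e^{i(\omega_k-\omega)t} - 1}{\omega_k - \omega} - \frac{e^{i(\omega_k+\omega)t} - 1}{\omega_k + \omega}\right]$$

where we have at once adjusted the constant of integration so that $c_k = 0$
when $t = 0$. For physical reasons, only the first term in the square
parenthesis need be retained because it alone can attain appreciable magnitude.
(Both $\omega$ and $\omega_k > 0$.) In fact $c_k$ is large only when $\omega \approx \omega_k$, and
this fact is accentuated when $c_k$ is squared:

$$|c_k|^2 = \frac{e^2F_0^2}{2\hbar^2}\,|x_{k1}|^2\,\frac{1 - \cos(\omega_k - \omega)t}{(\omega_k - \omega)^2} = \frac{e^2F_0^2}{4\hbar^2}\,|x_{k1}|^2\,\frac{\sin^2\tfrac{1}{2}(\omega_k - \omega)t}{\left[\tfrac{1}{2}(\omega_k - \omega)\right]^2} \tag{11-126}$$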


We now interpret this result. The coefficient $c_k$ is, in view of (124),
the $k$-th probability amplitude in the expansion of the state function $v$ at
time $t$ in terms of energy eigenstates of the normal atom. Hence because
of sec. 5, $|c_k|^2$ is the probability that at time $t$ the $k$-th energy level of the
atom be excited; it is the “transition probability” from state 1 to state $k$
when the atom has been exposed to monochromatic light of frequency
$\omega/2\pi$ for $t$ seconds.

Many interesting conclusions of a physical nature can be drawn from
eq. (126), of which only two will here be mentioned. First, the transition
probability is proportional to the square of the matrix element connecting
the states in question. Whenever $x_{1k}$ vanishes, $|c_k|^2 = 0$. Hence the
vanishing of $x_{1k}$ is the criterion of a “forbidden” transition. In the second
place, the transition probability is small unless $\omega \approx \omega_k$, which is the Bohr
frequency condition.
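The sharpness of the resonance expressed by (126) is readily seen in numbers. The sketch below evaluates the retained (resonant) term only; the single parameter `coupling` stands for $eF_0|x_{k1}|/\hbar$, and its value, like the frequencies and the time, is hypothetical. The result is meaningful only while $|c_k|^2$ remains small compared with unity, in accordance with the approximation made above.

```python
import math

def transition_probability(coupling, omega_k, omega, t):
    """Resonant term of eq. (126); coupling = e*F0*|x_k1|/hbar (a frequency)."""
    delta = omega_k - omega
    if delta == 0.0:
        # limit of sin^2(delta*t/2) / delta^2 as delta tends to zero
        return 0.25 * coupling ** 2 * t ** 2
    return coupling ** 2 * math.sin(0.5 * delta * t) ** 2 / delta ** 2

coupling, omega_k, t = 1.0e-3, 10.0, 5.0  # arbitrary illustrative units
on_resonance = transition_probability(coupling, omega_k, 10.0, t)
off_resonance = transition_probability(coupling, omega_k, 8.0, t)
print(on_resonance, off_resonance)  # the resonant value is much the larger
```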

Problem. The reader may be surprised to find that $|c_k|^2$ is not a linear function of $t$,
as might be expected on physical grounds. Show that, when the incident light forms a
continuous spectrum of uniform intensity, $|c_k|^2$ is proportional to $t$. (For this purpose,
(126) must be integrated over $\omega$ from 0 to $\infty$; but the integration may without appreciable
error be taken from $-\infty$ to $+\infty$.)


ELECTRON SPIN. PAULI THEORY 

11.29. Fundamentals of the Theory.— The theory so far developed 
describes the general behavior of atomic and molecular systems surprisingly 
well, but it makes some false predictions, particularly with regard to the
finer details of the energy states of atoms, the Zeeman effect, and the mag- 
netic properties of electrons. It was soon apparent that the state of a single 
electron could not be represented as a function of three space coordinates 





alone, but that another parameter was required whose interpretation was
for some time in doubt. Most decisive in clarifying the situation was the
spectroscopic observation of the doubling of the energy levels of a single
electron: In all alkali atoms, for instance, two levels are found where the
Schrödinger equation permits only one. The energy difference between
these levels was such as would be produced by a small magnet of magnetic
moment $\hbar e/2mc$ setting itself once parallel and then opposite to the magnetic
field present in the atom on account of the electron’s revolution.
Also, the angular momentum corresponding to these two energy
states was known to be different; it was equal to that caused by the electron’s
orbital motion, plus $\hbar/2$ in one, minus $\hbar/2$ in the other state.

Uhlenbeck and Goudsmit suggested that the electron behaves like a
spinning top having a “spin” angular momentum of magnitude $\hbar/2$
which, however, can only add or subtract its whole amount, in quantum
fashion, to any angular momentum the electron already possesses as a
result of its orbital motion. Correspondingly, the electron generates by its
spin a magnetic moment of magnitude $\hbar e/2mc$ ($m$ is the electron mass, $c$ the
velocity of light), and this also communicates itself in toto, either parallel
or in opposition, to any magnetic moment already present.

To describe the electron spin as an angular momentum of the usual kind 
and to associate with it an operator like L (eq. 44) proved a fruitless under- 
taking, chiefly because L would have more than two eigenstates. The most 
successful procedure of including the spin in the quantum mechanical for- 
malism, aside from Dirac’s relativistic treatment of the electron, is that of 
Pauli which will now be described. What follows will refer only to the 
spin states of a single electron; some applications to several electrons may 
be found in secs. 34 and 35. 

Since the three space coordinates are insufficient to specify the complete
state of an electron, we introduce a fourth, the “spin coordinate,” and
denote it by $s_z$. It corresponds, in classical language, to the cosine of the
angle between the axis of the spin angular momentum and the Z-axis of
coordinates. This visual interpretation, while in no way dictated by the
mathematical formalism, will be found a useful mental aid. Thus the
state function of an electron has the form

$$\phi = \phi(x, y, z, s_z)$$

Since in all that follows, the hypothetical spin coordinates $s_x$ and $s_y$ are
never needed, we shall henceforth delete the subscript $z$ on $s$, but retain the
above interpretation. Hence $\phi = \phi(x, y, z, s)$. Finally, it is well for the
moment to abstract attention entirely from the space dependent part of
the wave function, i.e., to consider $x$, $y$, $z$ as fixed, concentrating our inquiry
solely upon the electron spin. Then $\phi = \phi(s)$.





If $s$, like $x$, $y$ and $z$, were permitted to assume a continuous range of
values, difficulties would result. Pauli therefore postulates, in a manner
admittedly ad hoc and designed to force success of the theory, that the
range of $s$ consists of only two points: $s = \pm 1$ (classical meaning: spin
vector is parallel or in opposition to Z). A function of $s$ is therefore
defined only at these two points. The most general spin function is,
accordingly,

$$\phi(s) = a\,\delta_{s,+1} + b\,\delta_{s,-1} \tag{11-127}$$

where the $\delta$’s are Kronecker symbols.

Our postulates involved certain integrals over configuration space.
But an integral over configuration space consisting of two points vanishes.
It becomes necessary to redefine the integral as a summation over the two
points:

$$\int F(s)\,ds = F(-1) + F(1)$$

If $\phi(s)$ is to be normalized,

$$\int \left(|a|^2\delta_{s,+1} + |b|^2\delta_{s,-1} + (a^*b + b^*a)\,\delta_{s,+1}\delta_{s,-1}\right) ds = |a|^2 + |b|^2 = 1 \tag{11-128}$$

In a very trivial sense, eq. (127) represents an expansion of a function
$\phi(s)$ in a complete orthonormal set of functions, $\delta_{s,+1}$ and $\delta_{s,-1}$. To what
operator do these two functions belong as eigenstates? The answer is
suggested by intuition and will be justified by its complete success; it is
the operator $S_z$ which is associated with the observable: spin angular
momentum along Z. We must now give thought to the mathematical
structure of this operator.

Empirical evidence cited in the introductory paragraphs demands that
its two eigenvalues be $\pm\hbar/2$. Hence it must satisfy the two equations

$$S_z\,\delta_{s,+1} = \frac{\hbar}{2}\,\delta_{s,+1}, \qquad S_z\,\delta_{s,-1} = -\frac{\hbar}{2}\,\delta_{s,-1} \tag{11-129}$$

It is possible to show that no differential operator of the type encountered
previously can satisfy these equations without giving rise to an infinite number
of other eigenstates. But why search for the operator? The simplest
point of view, and that here taken, is to regard eqs. (129) as a definition of





the operator $S_z$.$^{27}$ The result of applying $S_z$ to the most general function
of $s$ (eq. 127) can be constructed on the basis of (129), hence (129) exhausts
the meaning of $S_z$ and is its definition.

To simplify the notation, and to be in accord with custom, we now introduce
the symbol $\alpha(s)$ for $\delta_{s,+1}$ and $\beta(s)$ for $\delta_{s,-1}$. Furthermore, we
define a new operator

$$\sigma_z = \frac{2}{\hbar}\,S_z$$

which has eigenvalues $\pm 1$, for the simple expedient to save writing. Then,
in view of (129),

$$\sigma_z\alpha(s) = \alpha(s), \qquad \sigma_z\beta(s) = -\beta(s) \tag{11-130}$$


It is indeed possible and often useful to find an explicit operator in form
of a matrix which will satisfy these equations. This matrix is easily formed
by means of the principles outlined in sec. 17. Our eigenstates are $\psi_1 = \alpha$,
$\psi_2 = \beta$ and we construct $(\sigma_z)_{ij} = \int \psi_i^*\sigma_z\psi_j \, d\tau$ with the integral replaced
by a summation. We thus obtain the two-square matrix

$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{11-131}$$

To let it operate on what was formerly the function $\phi(s)$ the latter has
to be regarded as a vector whose components are its expansion coefficients:
If the function $\phi$ is given by

$$\phi(s) = a\alpha + b\beta$$

$a$ and $b$ being numbers, then the vector $\phi(s)$ is

$$\phi = \begin{pmatrix} a \\ b \end{pmatrix}$$

Thus, in the matrix representation,

$$\sigma_z\phi = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ -b \end{pmatrix} \tag{11-132}$$

and the reader will easily verify by the rules of Chapter 10 that the two
eigenvectors of $\sigma_z$ are $\phi = \begin{pmatrix} a \\ 0 \end{pmatrix}$ and $\phi = \begin{pmatrix} 0 \\ b \end{pmatrix}$, where the values of both $a$


27 An operator P is in general uniquely determined when the result of its action 
upon each member of an orthonormal set of functions is known. This method of defin- 
ing an operator is ordinarily not useful because an infinite number of relations like (129) 
would be required. 





and $b$ must be unity because of (128). The eigenvalues are, respectively,
$+1$ and $-1$. But the functions $\phi$ corresponding to the vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$
are clearly $\alpha$ and $\beta$, which takes us back to the scheme (130).

It is seen that there is a complete isomorphism between the two descriptions
of the operator $S_z$ and its eigenstates $\phi$: One in terms of matrices
and eigenvectors, where the rule of operations is (132); the other in terms
of linear substitution operators and eigenfunctions, where the rule of operations
is (130).

The question now arises as to the structure of the operators $S_x$ and $S_y$,
associated with the other two components of the spin.$^{28}$ In endeavoring to
construct them it is important to recall one significant fact concerning the
ordinary angular momentum $\mathbf{L}$: its components do not commute with one
another. In fact (see eq. 7)

$$L_xL_y - L_yL_x = i\hbar L_z, \quad L_yL_z - L_zL_y = i\hbar L_x, \quad L_zL_x - L_xL_z = i\hbar L_y$$

Let us assume that the components of the spin $\mathbf{S}$, this being an angular
momentum operator, must be subject to the same commutation rules. In
terms of $\sigma$ rather than $S$, we postulate

$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z; \quad \sigma_y\sigma_z - \sigma_z\sigma_y = 2i\sigma_x; \quad \sigma_z\sigma_x - \sigma_x\sigma_z = 2i\sigma_y \tag{11-133}$$

These relations imply that an eigenstate of $S_z$, e.g., $\alpha(s)$ or $\beta(s)$, cannot be a
simultaneous eigenstate of $S_x$ or $S_y$ (sec. 7).

The construction of $\sigma_x$ and $\sigma_y$, $\sigma_z$ being given, is more easily performed
in the matrix scheme. If we set ourselves the problem of determining two
matrices $\sigma_x$ and $\sigma_y$, which, when combined with $\sigma_z$ of eq. (131), obey (133),
we easily find that the answer is not unique. But certainly the solution

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{11-134}$$

is a possible one. The ambiguity here encountered permits just enough
freedom to make possible a rotation of coordinate axes (see Chap. 15).
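That the matrices (131) and (134) obey the commutation rules (133) may be confirmed by direct multiplication. The sketch below does so with plain nested lists; the helper functions are written for this illustration only.

```python
def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(p, q):
    """pq - qp for 2x2 matrices."""
    pq, qp = matmul(p, q), matmul(q, p)
    return [[pq[i][j] - qp[i][j] for j in range(2)] for i in range(2)]

def scale(c, p):
    """Multiply every element of p by the number c."""
    return [[c * p[i][j] for j in range(2)] for i in range(2)]

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

checks = [
    commutator(SIGMA_X, SIGMA_Y) == scale(2j, SIGMA_Z),
    commutator(SIGMA_Y, SIGMA_Z) == scale(2j, SIGMA_X),
    commutator(SIGMA_Z, SIGMA_X) == scale(2j, SIGMA_Y),
]
print(checks)  # each entry is True when the corresponding rule of (133) holds
```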

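Let us, then, accept (134) as our solution in matrix form. Clearly, $\sigma_x$
has eigenvalues $\pm 1$, eigenvectors $\sqrt{\tfrac{1}{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\sqrt{\tfrac{1}{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$; $\sigma_y$ has eigenvalues
$\pm 1$, eigenvectors $\sqrt{\tfrac{1}{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}$ and $\sqrt{\tfrac{1}{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix}$. The observable values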


28 While we need only one spin coordinate, $s_z$, all three components of the operator
must be introduced because they appear in the Hamiltonian and other operators.





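of all three components $S_x$, $S_y$ and $S_z$ are therefore $\pm\hbar/2$. When these
results are translated into the function language they read as follows.

The equation $\sigma_x\phi(s) = \lambda\phi(s)$ has two possible (normalized) solutions:

$$\lambda = 1,\ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) + \beta(s)]; \qquad \lambda = -1,\ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) - \beta(s)] \tag{11-135a}$$

The equation $\sigma_y\phi(s) = \lambda\phi(s)$ has two possible solutions:

$$\lambda = 1,\ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) + i\beta(s)]; \qquad \lambda = -1,\ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) - i\beta(s)] \tag{11-135b}$$

The equation $\sigma_z\phi(s) = \lambda\phi(s)$ has two possible solutions:

$$\lambda = 1,\ \phi(s) = \alpha(s); \qquad \lambda = -1,\ \phi(s) = \beta(s) \tag{11-135c}$$

If now we write the eqs. (135a) in the simpler form

$$\sigma_x\alpha + \sigma_x\beta = \alpha + \beta, \qquad \sigma_x\alpha - \sigma_x\beta = -(\alpha - \beta)$$

and solve these by adding and subtracting, we find

$$\sigma_x\alpha = \beta, \qquad \sigma_x\beta = \alpha$$

The same procedure applied to (135b) and (135c) yields similar relations.
Summarizing these results: The operators $\sigma_x$, $\sigma_y$, $\sigma_z$ may be represented
either by the set of linear substitutions

$$\begin{aligned}
\sigma_x\alpha &= \beta, & \sigma_y\alpha &= i\beta, & \sigma_z\alpha &= \alpha, \\
\sigma_x\beta &= \alpha; & \sigma_y\beta &= -i\alpha; & \sigma_z\beta &= -\beta
\end{aligned} \tag{11-136}$$

or by the matrices

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{11-137}$$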


For practical use, the set of substitutions is to be preferred. 
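Note that the operators

$$\sigma^+ \equiv \tfrac12(\sigma_x + i\sigma_y) \quad \text{and} \quad \sigma^- \equiv \tfrac12(\sigma_x - i\sigma_y)$$

satisfy the convenient relations

$$\begin{aligned} \sigma^+\alpha &= 0, & \sigma^-\alpha &= \beta, \\ \sigma^+\beta &= \alpha, & \sigma^-\beta &= 0 \end{aligned}$$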


They are sometimes called “ displacement operators.” 
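
The substitutions (136) and the relations for the displacement operators can be checked mechanically. The following is a minimal sketch in plain Python, assuming the representation $\alpha = (1, 0)$, $\beta = (0, 1)$ and the matrices (134); the helper names `apply`, `combine` and `scale` are illustrative only and do not come from the text.

```python
# Spin functions as two-component vectors (an assumed representation).
ALPHA = (1, 0)
BETA = (0, 1)

# Matrices of eq. (134) together with the diagonal sigma_z, stored row by row.
SIGMA_X = ((0, 1), (1, 0))
SIGMA_Y = ((0, -1j), (1j, 0))
SIGMA_Z = ((1, 0), (0, -1))


def apply(matrix, vector):
    """Multiply a 2x2 matrix into a two-component vector."""
    return tuple(
        sum(matrix[row][col] * vector[col] for col in range(2))
        for row in range(2)
    )


def combine(c1, m1, c2, m2):
    """Form the 2x2 matrix c1*m1 + c2*m2."""
    return tuple(
        tuple(c1 * m1[row][col] + c2 * m2[row][col] for col in range(2))
        for row in range(2)
    )


def scale(factor, vector):
    """Multiply every component of a vector by a number."""
    return tuple(factor * component for component in vector)


# Displacement operators sigma+ and sigma- built from sigma_x and sigma_y.
SIGMA_PLUS = combine(0.5, SIGMA_X, 0.5j, SIGMA_Y)
SIGMA_MINUS = combine(0.5, SIGMA_X, -0.5j, SIGMA_Y)

print("sigma_x alpha =", apply(SIGMA_X, ALPHA))
print("sigma_y alpha =", apply(SIGMA_Y, ALPHA))
print("sigma+ beta   =", apply(SIGMA_PLUS, BETA))
print("sigma- alpha  =", apply(SIGMA_MINUS, ALPHA))
```

Comparing each result with the corresponding multiple of `ALPHA` or `BETA` reproduces the table of substitutions line by line.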

We return to the consideration of the general state function of an electron,
which includes $x$, $y$, $z$ and $s$ as arguments. Such a function may
certainly be expanded in eigenfunctions of $\sigma_z$, i.e.,

$$\phi(x,y,z,s) = \phi_+(x,y,z)\,\alpha(s) + \phi_-(x,y,z)\,\beta(s)$$

Normalization now requires

$$\int \phi^*\phi\,d\tau = \sum_s \int \phi^*\phi\,dx\,dy\,dz = \int (\phi_+^*\phi_+ + \phi_-^*\phi_-)\,dx\,dy\,dz = 1$$

The operators $\sigma_x$, $\sigma_y$, $\sigma_z$ do not act on $\phi_+$ and $\phi_-$, which are only functions of
$x$, $y$, $z$; in other words, they commute with space coordinates. Thus, for
instance,

$$\sigma_y\phi = \phi_+\sigma_y\alpha + \phi_-\sigma_y\beta = i\phi_+\beta - i\phi_-\alpha$$

In the matrix scheme, $\phi(x,y,z,s)$ is represented by the vector

$$\phi = \begin{pmatrix} \phi_+(x,y,z) \\ \phi_-(x,y,z) \end{pmatrix}$$

In the sense of this analysis it may be said that the introduction of the spin 
in the Pauli manner causes all Schrodinger functions to become two-com- 
ponent functions. 


Problem. Carry out the algebra involved in finding the two Hermitian matrices 
(134). 

11.30. Applications. — a. Atom in a Magnetic Field. Our interest here
is not in a complete solution of this problem, which may be found worked
out in most books on quantum mechanics, but in its salient mathematical
features. We wish to find the energies of a one-electron atom (e.g., hydrogen
or, with good approximation, the alkalis) when it is placed in a uniform
magnetic field. The Hamiltonian consists of two parts, one acting on the
electron's space coordinates and one acting on the spin coordinate. The
former will be called $H_0$; the latter is the "spin energy." If the magnetic
field $\boldsymbol{\mathcal{H}}$ is taken along the Z-axis, the classical energy of a particle of
magnetic moment $\boldsymbol{\mu}$ would be $\boldsymbol{\mu}\cdot\boldsymbol{\mathcal{H}} = \mu_z\mathcal{H}_z$. But empirically, the magnetic
moment associated with the spin is $(\hbar e/2mc)\boldsymbol{\sigma}$. We shall here write $\mu$ for
the constant $\hbar e/2mc$. In quantum mechanical transcription, then, the
"spin energy" is $\mu\mathcal{H}_z\sigma_z$ where $\sigma_z$ is interpreted as the operator (130) or
(131). The Schrödinger equation becomes

$$(H_0 + \mu\mathcal{H}_z\sigma_z)\psi = E\psi \tag{11-138}$$

Let

$$\psi(x,y,z,s) = \psi_+(x,y,z)\,\alpha(s) + \psi_-(x,y,z)\,\beta(s)$$

and substitute, obtaining

$$\alpha(s)[H_0 + \mu\mathcal{H}_z - E]\psi_+ + \beta(s)[H_0 - \mu\mathcal{H}_z - E]\psi_- = 0$$





provided relations (136) are used. Since $\alpha$ and $\beta$ are linearly independent,
orthogonal functions of $s$, their coefficients in the last equation must
separately vanish.²⁹ Hence we have

$$H_0\psi_+ = (E - \mu\mathcal{H}_z)\psi_+, \qquad H_0\psi_- = (E + \mu\mathcal{H}_z)\psi_- \tag{11-139}$$

Now let $E_0$ be an eigenvalue of $H_0$, $\psi_0$ the corresponding eigenfunction.
The first of eqs. (139) (which is nothing more than an eigenvalue equation
for the operator $H_0$) then says $E - \mu\mathcal{H}_z = E_0$, or $E = E_0 + \mu\mathcal{H}_z$, $\psi_+ = \psi_0$.
On substituting this value of $E$ into the second equation it reads
$H_0\psi_- = (E_0 + 2\mu\mathcal{H}_z)\psi_-$, and this can only be satisfied by putting $\psi_- = 0$ because
$E_0 + 2\mu\mathcal{H}_z$ is not an eigenvalue of $H_0$. Thus we obtain as one solution
of (138)

$$E = E_0 + \mu\mathcal{H}_z, \qquad \psi = \psi_0(x,y,z)\,\alpha(s) \tag{11-140a}$$

But we can also start with the second of eqs. (139) and assume $\psi_-$ to be
$\psi_0$, $E + \mu\mathcal{H}_z$ to be $E_0$. Then $\psi_+ = 0$ and we have

$$E = E_0 - \mu\mathcal{H}_z, \qquad \psi = \psi_0(x,y,z)\,\beta(s) \tag{11-140b}$$

How does the inclusion of the spin modify the eigenvalues and eigenfunctions
of the Schrödinger equation when there is no magnetic field?
The answer is obtained by letting $\mathcal{H}_z$ vanish in (140a, b). Both values of $E$
coalesce to $E_0$, which now represents the ordinary Schrödinger energy in
the absence of a field, but the functions remain distinct. The spin thus
introduces a degeneracy into the Schrödinger representation of states.
Formulas (140) account, in a primitive way, for the doubling of the
alkali energy levels, the field $\mathcal{H}_z$ being caused in that case by the electron's
orbital motion, and not by external agencies.
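
The splitting described by (140a, b) and its collapse at zero field can be illustrated with a few lines of Python. This is only a sketch: the function names and the numerical values of $E_0$, $\mu$ and the field are hypothetical, in arbitrary units.

```python
def spin_levels(e0, mu, field_z):
    """Energies of eqs. (140a, b) belonging to one orbital level e0.

    Returns a list of (energy, spin function) pairs.
    """
    return [(e0 + mu * field_z, "alpha"), (e0 - mu * field_z, "beta")]


def is_degenerate(levels):
    """True when both spin states share a single energy."""
    return levels[0][0] == levels[1][0]


# Hypothetical numbers in arbitrary units, chosen only for illustration.
E0 = -1.0
MU = 0.5

with_field = spin_levels(E0, MU, 0.25)
without_field = spin_levels(E0, MU, 0.0)

print(with_field, is_degenerate(with_field))
print(without_field, is_degenerate(without_field))
```

With the field switched off the two energies coincide while the labels `alpha` and `beta` remain distinct, which is the degeneracy referred to above.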

Problem. Solve eq. (138) by the method of separation of variables, i.e., by putting
$\psi(x,y,z,s) = \psi(x,y,z)\,\phi(s)$, and show that (140) is the solution obtained by that method also.

b. A Spin Problem. Having shown how spin and coordinate functions
cooperate in the description of the state of an electron, let us omit further
reference to space coordinates and inquire what are the energies which an
electron, placed in a uniform magnetic field of arbitrary direction, may
assume regardless of its translational motion. The only energy of interest
is that due to the spin. Let $\boldsymbol{\mathcal{H}}$ be the magnetic field strength. The
Schrödinger equation reads

$$\mu\,\boldsymbol{\mathcal{H}}\cdot\boldsymbol{\sigma}\,\psi \equiv \mu(\mathcal{H}_x\sigma_x + \mathcal{H}_y\sigma_y + \mathcal{H}_z\sigma_z)\,\psi(s) = E\psi(s) \tag{11-141}$$

If $\boldsymbol{\mathcal{H}}$ is taken along Z, the equation reduces to

$$\mu\mathcal{H}\sigma_z\psi(s) = E\psi(s) \tag{11-142}$$

29 This can be seen explicitly if the equation is multiplied by either $\alpha(s)$ or $\beta(s)$ and
then "integrated" over $s$.





The operator on the left is but a constant multiple of $\sigma_z$ and must therefore
have the same eigenfunctions as $\sigma_z$, i.e., $\alpha$ and $\beta$. The corresponding
eigenvalues are at once seen to be $E = \pm\mu\mathcal{H}$. We shall show that eq. (141) has
the same eigenvalues, but different eigenfunctions.

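Make the substitution $\psi = a\,\alpha(s) + b\,\beta(s)$ in eq. (141). On using,
subsequently, relations (136) the result will be

$$\mu\{\mathcal{H}_x(a\beta + b\alpha) + i\mathcal{H}_y(a\beta - b\alpha) + \mathcal{H}_z(a\alpha - b\beta)\} - E(a\alpha + b\beta) = 0$$

As before, the coefficients of $\alpha$ and $\beta$ may be put equal to zero separately,
so that

$$\begin{aligned} \mu(\mathcal{H}_x a + i\mathcal{H}_y a - \mathcal{H}_z b) &= Eb \\ \mu(\mathcal{H}_x b - i\mathcal{H}_y b + \mathcal{H}_z a) &= Ea \end{aligned} \tag{11-143}$$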

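If the equations are to have solutions $a$, $b$ which are different from zero, the
determinant of the coefficients of $a$, $b$ must vanish, whence $E = \pm\mu\mathcal{H}$, where
$\mathcal{H} = (\mathcal{H}_x^2 + \mathcal{H}_y^2 + \mathcal{H}_z^2)^{1/2}$.
On substituting $E = +\mu\mathcal{H}$ into the first of eqs. (143) and then taking the
square of its absolute value, we have

$$(\mathcal{H}_x^2 + \mathcal{H}_y^2)\,|a|^2 = (\mathcal{H} + \mathcal{H}_z)^2\,|b|^2$$

Let us call the angle between $\boldsymbol{\mathcal{H}}$ and the Z-axis $\theta$, so that $\mathcal{H}_x^2 + \mathcal{H}_y^2 =
\mathcal{H}^2\sin^2\theta$, and $\mathcal{H}_z = \mathcal{H}\cos\theta$. Furthermore, in view of (128), $|b|^2 =
1 - |a|^2$. When these substitutions are made and the last equation is
solved, the squares of the absolute values of $a$, $b$ are found to be $\cos^2\theta/2$
and $\sin^2\theta/2$, respectively. Let us then put $a = \cos\theta/2$, $b = e^{i\delta}\sin\theta/2$,
treating $\delta$ as a phase constant. With the further substitutions $\mathcal{H}_x =
\mathcal{H}\sin\theta\cos\varphi$, $\mathcal{H}_y = \mathcal{H}\sin\theta\sin\varphi$, where $\varphi$ is the azimuth of the field, we
find from (143) that $\delta = \varphi$.

In a similar way, when $E = -\mu\mathcal{H}$, we find $a = \sin\theta/2$, $b = e^{i\delta}\cos\theta/2$ with
$\delta = \pi + \varphi$, i.e., $b = -e^{i\varphi}\cos\theta/2$.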

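We conclude that eq. (141) has the eigenvalues $E_1 = \mu\mathcal{H}$, $E_2 = -\mu\mathcal{H}$,
and the corresponding eigenfunctions

$$\begin{aligned} \psi_1 &= \cos\frac{\theta}{2}\cdot\alpha(s) + \sin\frac{\theta}{2}\,e^{i\varphi}\,\beta(s) \\ \psi_2 &= \sin\frac{\theta}{2}\cdot\alpha(s) - \cos\frac{\theta}{2}\,e^{i\varphi}\,\beta(s) \end{aligned} \tag{11-144}$$

Notice that, when the field $\boldsymbol{\mathcal{H}}$ is reversed in direction (i.e., $\theta \to \pi - \theta$,
$\varphi \to \varphi + \pi$), $\psi_1$ and $\psi_2$ exchange their roles.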
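
A numerical check of this result may be helpful. The sketch below, in plain Python, builds the matrix of $\mu\,\boldsymbol{\mathcal{H}}\cdot\boldsymbol{\sigma}$ from (134) and tests the two functions of (144), taken here with the phase factor $e^{i\varphi}$ on $\beta$ in both; the function names and the numerical values of $\mu$, $\mathcal{H}$, $\theta$, $\varphi$ are arbitrary illustrative assumptions.

```python
import cmath
import math


def field_components(strength, theta, phi):
    """Cartesian components of a field with polar angle theta and azimuth phi."""
    return (
        strength * math.sin(theta) * math.cos(phi),
        strength * math.sin(theta) * math.sin(phi),
        strength * math.cos(theta),
    )


def spin_hamiltonian(mu, hx, hy, hz):
    """Matrix of mu*(Hx sigma_x + Hy sigma_y + Hz sigma_z), using (134)."""
    return (
        (mu * hz, mu * (hx - 1j * hy)),
        (mu * (hx + 1j * hy), -mu * hz),
    )


def psi_1(theta, phi):
    """Coefficients (a, b) of alpha and beta for the energy +mu*H."""
    return (math.cos(theta / 2), math.sin(theta / 2) * cmath.exp(1j * phi))


def psi_2(theta, phi):
    """Coefficients (a, b) of alpha and beta for the energy -mu*H."""
    return (math.sin(theta / 2), -math.cos(theta / 2) * cmath.exp(1j * phi))


def residual(matrix, vector, energy):
    """Largest magnitude among the components of (matrix*vector - energy*vector)."""
    return max(
        abs(sum(matrix[r][c] * vector[c] for c in range(2)) - energy * vector[r])
        for r in range(2)
    )


# Arbitrary illustrative values.
MU, STRENGTH, THETA, PHI = 1.0, 2.0, 0.7, 1.9
H = spin_hamiltonian(MU, *field_components(STRENGTH, THETA, PHI))

print("residual for psi_1:", residual(H, psi_1(THETA, PHI), MU * STRENGTH))
print("residual for psi_2:", residual(H, psi_2(THETA, PHI), -MU * STRENGTH))
```

Both residuals vanish to rounding error, and evaluating `psi_1` at the reversed direction $(\pi - \theta, \varphi + \pi)$ returns the coefficients of `psi_2`, in line with the remark on field reversal.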





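Problem. Solve eq. (141) by diagonalizing the matrix

$$\boldsymbol{\mathcal{H}}\cdot\boldsymbol{\sigma} = \mathcal{H}_x\sigma_x + \mathcal{H}_y\sigma_y + \mathcal{H}_z\sigma_z = \begin{pmatrix} \mathcal{H}_z & \mathcal{H}_x - i\mathcal{H}_y \\ \mathcal{H}_x + i\mathcal{H}_y & -\mathcal{H}_z \end{pmatrix}$$

and show that it leads to the same results.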


THE MANY-BODY PROBLEM AND THE EXCLUSION PRINCIPLE 


11.31. Separation of the Coordinates of the Center of Mass. — In 
classical mechanics, a system containing many particles and subject only 
to internal forces behaves in such a way that its center of mass moves uni- 
formly on a straight line. As a corollary of this theorem every classical 
two-body problem may be reduced to a one-body problem. 30 A similar 
fact may be proved in quantum theory. 

The Schrödinger equation for a system of $n$ particles of masses
$m_1, \cdots, m_n$ reads:

$$\left(-\sum_{i=1}^{n}\frac{\hbar^2}{2m_i}\nabla_i^2 + V\right)\psi = E\psi \tag{11-145}$$

where $\nabla_i^2 = \partial^2/\partial x_i^2 + \partial^2/\partial y_i^2 + \partial^2/\partial z_i^2$. The potential energy, $V$, is to be
regarded as a function of the relative coordinates $x_j - x_i$, $y_j - y_i$, $z_j - z_i$.
We first transform to a new set of coordinates, defined as follows:


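$$X = \frac{1}{M}\sum_{i=1}^{n} m_i x_i, \qquad M = \sum_{i=1}^{n} m_i$$

$$x_2' = x_2 - X, \quad x_3' = x_3 - X, \quad \cdots, \quad x_n' = x_n - X \tag{11-146}$$

with similar relations for the $y$ and $z$ components. Note that $x_1'$ is missing;
the coordinates of one particle have been eliminated by the introduction of
the center of mass coordinates $X$, $Y$, $Z$. In computing the sum of the
Laplacian operators occurring in (145) in terms of the new coordinates we
observe:

$$\frac{\partial X}{\partial x_i} = \frac{\partial Y}{\partial y_i} = \frac{\partial Z}{\partial z_i} = \frac{m_i}{M}; \qquad \frac{\partial x_j'}{\partial x_i} = \frac{\partial y_j'}{\partial y_i} = \frac{\partial z_j'}{\partial z_i} = \delta_{ij} - \frac{m_i}{M}$$

Using these relations, simple differentiation yields (all sums over primed
indices running from 2 to $n$)

$$\frac{\partial^2}{\partial x_1^2} = \frac{m_1^2}{M^2}\left(\frac{\partial^2}{\partial X^2} - 2\sum_i \frac{\partial^2}{\partial X\,\partial x_i'} + \sum_{i,j}\frac{\partial^2}{\partial x_i'\,\partial x_j'}\right)$$

$$\frac{\partial^2}{\partial x_i^2} = \frac{m_i^2}{M^2}\left(\frac{\partial^2}{\partial X^2} - 2\sum_j \frac{\partial^2}{\partial X\,\partial x_j'} + \sum_{j,k}\frac{\partial^2}{\partial x_j'\,\partial x_k'}\right) + \frac{2m_i}{M}\left(\frac{\partial^2}{\partial X\,\partial x_i'} - \sum_j \frac{\partial^2}{\partial x_i'\,\partial x_j'}\right) + \frac{\partial^2}{\partial x_i'^2}, \qquad i \geq 2$$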



30 So long as relativity effects are neglected. 





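and similar expressions for the derivatives with respect to $y$ and $z$. When
these are combined we obtain, in place of (145), the equation

$$\left[-\frac{\hbar^2}{2M}\nabla^2 - \sum_{i=2}^{n}\frac{\hbar^2}{2m_i}\nabla_i'^2 + \frac{\hbar^2}{2M}\sum_{i,j=2}^{n}\left(\frac{\partial^2}{\partial x_i'\,\partial x_j'} + \frac{\partial^2}{\partial y_i'\,\partial y_j'} + \frac{\partial^2}{\partial z_i'\,\partial z_j'}\right) + V\right]\psi = E\psi \tag{11-147}$$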


Here $\nabla^2$ is the Laplacian with respect to the center of mass coordinates,
$\nabla_i'^2$ with respect to the primed coordinates. While $V$ is not directly a
function of the primed coordinates, it may be expressed in terms of them
because $x_j - x_i = x_j' - x_i'$. A difficulty might seem to appear in connection
with $x_i - x_1$ because $x_1'$ is absent from the primed set. But it is easily
seen that $m_1 x_1' = -\sum_{j=2}^{n} m_j x_j'$, whence $x_i - x_1 = x_i' + \frac{1}{m_1}\sum_{j=2}^{n} m_j x_j'$.
Therefore $V$, when expressed in terms of the new coordinates, will not contain
$X$, $Y$, or $Z$.

As a result, eq. (147) is separable; therefore $\psi$ may be written as

$$\chi(X,Y,Z)\cdot\phi(x_2', \cdots, z_n')$$

Correspondingly, $E = E_c + E'$, where $E_c$ is the energy associated with
$\chi(X,Y,Z)$, determined by

$$-\frac{\hbar^2}{2M}\nabla^2\chi = E_c\chi$$

This is the Schrödinger equation of a free particle of mass $M$; it produces,
as we know, no quantization. The remainder of (147) describes
the internal motion of the particles:

$$\left[-\sum_{i=2}^{n}\frac{\hbar^2}{2m_i}\nabla_i'^2 + \frac{\hbar^2}{2M}\sum_{i,j=2}^{n}\nabla_i'\cdot\nabla_j' + V\right]\phi = E'\phi \tag{11-148}$$

It differs from the normal form of Schrödinger's equation by the presence
of the terms in $\nabla_i'\cdot\nabla_j'$ and by the fact that $V$ has a different functional form
in the primed coordinates than in the unprimed ones.

The coordinates (146) measure the position of the $i$-th particle relative
to the center of mass. It is also possible to use a less symmetrical but
physically more useful set of coordinates, which is closely related to (146).
If we put

$$X = \frac{1}{M}\sum_{i=1}^{n} m_i x_i, \qquad M = \sum_{i=1}^{n} m_i$$

$$x_2' = x_2 - x_1, \quad x_3' = x_3 - x_1, \quad \cdots, \quad x_n' = x_n - x_1 \tag{11-149}$$

thus measuring all coordinates relative to that one which has been eliminated
($x_1$), we obtain in the same manner the equation

$$\left[-\frac{\hbar^2}{2M}\nabla^2 - \sum_{i=2}^{n}\frac{\hbar^2}{2m_i}\nabla_i'^2 - \frac{\hbar^2}{2m_1}\sum_{i,j=2}^{n}\nabla_i'\cdot\nabla_j' + V\right]\psi = E\psi \tag{11-150}$$
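
The coefficients appearing in (147) and (150) can be verified without carrying out the differentiations by hand. For a linear change of variables $Q_a = \sum_i J_{ai}x_i$ one has $\sum_i (1/m_i)\,\partial^2/\partial x_i^2 = \sum_{a,b} G_{ab}\,\partial^2/\partial Q_a\partial Q_b$ with $G_{ab} = \sum_i J_{ai}J_{bi}/m_i$. The sketch below evaluates $G$ exactly with rational arithmetic for both coordinate sets; the function names and the three masses are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def jacobian_center_of_mass(masses):
    """Rows d(new)/d(old) for eq. (146): X, then x_j' = x_j - X for j >= 2."""
    total = sum(masses)
    count = len(masses)
    rows = [[mass / total for mass in masses]]
    for j in range(1, count):
        rows.append(
            [(1 if i == j else 0) - masses[i] / total for i in range(count)]
        )
    return rows


def jacobian_relative_to_first(masses):
    """Rows d(new)/d(old) for eq. (149): X, then x_j' = x_j - x_1 for j >= 2."""
    total = sum(masses)
    count = len(masses)
    rows = [[mass / total for mass in masses]]
    for j in range(1, count):
        row = [Fraction(0)] * count
        row[0] = Fraction(-1)
        row[j] = Fraction(1)
        rows.append(row)
    return rows


def second_derivative_coefficients(jacobian, masses):
    """G[a][b] = sum_i J[a][i] * J[b][i] / m_i."""
    count = len(masses)
    return [
        [
            sum(jacobian[a][i] * jacobian[b][i] / masses[i] for i in range(count))
            for b in range(count)
        ]
        for a in range(count)
    ]


# Hypothetical masses in arbitrary units (particle 1 is the heavy one).
MASSES = [Fraction(7), Fraction(1), Fraction(2)]

G_CM = second_derivative_coefficients(jacobian_center_of_mass(MASSES), MASSES)
G_REL = second_derivative_coefficients(jacobian_relative_to_first(MASSES), MASSES)

for name, g in (("(146)", G_CM), ("(149)", G_REL)):
    print(name, [[str(entry) for entry in row] for row in g])
```

Index 0 refers to $X$. In both cases the $X$ row decouples with coefficient $1/M$; the primed block is $\delta_{jk}/m_j - 1/M$ for (146) and $\delta_{jk}/m_j + 1/m_1$ for (149), which accounts for the opposite signs of the cross terms in the two equations.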





This form is particularly useful when it is desired to calculate the energy of
a many-electron atom, for particle 1 may then be taken to be the nucleus and
the summations in (150) are extended only over the electrons. The equation
remaining after separation of the motion of the center of mass is now

$$\left[-\frac{\hbar^2}{2}\left(\sum_i \frac{1}{m}\nabla_i'^2 + \frac{1}{m_1}\sum_{i,j}\nabla_i'\cdot\nabla_j'\right) + V\right]\phi = E'\phi$$

where $m$ is the mass of an electron, $m_1$ that of the nucleus. It may be
written in terms of the reduced mass

$$\mu = \frac{m\,m_1}{m + m_1}$$

as follows:

$$\left[-\frac{\hbar^2}{2\mu}\sum_i \nabla_i'^2 - \frac{\hbar^2}{2m_1}\sum_{i\neq j}\nabla_i'\cdot\nabla_j' + V\right]\phi = E'\phi \tag{11-151}$$

The terms in the double summation play an important role in the isotope
effect of heavy atoms.³¹ They are present whenever the number of electrons
is greater than one. For the case of hydrogen, eq. (151) has the same
form as Schrödinger's equation for a stationary nucleus, except for the
replacement of the electron mass by $\mu$. Hence the true energies of the
hydrogen atom are not exactly given by eq. (64), but by that equation with
$\mu$ written for $m$.

Note that the function $V$ is different in (148) and (151), and that the
terms of the double summation have opposite signs. Nevertheless the
equivalence of these two equations for the two-body problem may be seen
as follows. Write for the potential energy in (151)

$$V = V(x', y', z'), \quad \text{where } x' = x_2 - x_1, \text{ etc.}$$

The $V$-function of (148) must then be expressed in terms of $x_2 - X$, $y_2 - Y$,
$z_2 - Z$. Now $x_2 - x_1 = \dfrac{m_1 + m_2}{m_1}(x_2 - X)$. Therefore we must use
in (148)

$$V = V\left(\frac{m_1 + m_2}{m_1}x',\ \frac{m_1 + m_2}{m_1}y',\ \frac{m_1 + m_2}{m_1}z'\right)$$

and the equation reads

$$\left[-\frac{\hbar^2}{2}\left(\frac{1}{m_2} - \frac{1}{m_1 + m_2}\right)\nabla'^2 + V(ax', ay', az')\right]\phi(x',y',z') = E'\phi(x',y',z')$$

31 See Hughes, A. L., and Eckart, C., Phys. Rev. 36, 694 (1930).





where $a = (m_1 + m_2)/m_1$. If here we put $ax' = x''$, $ay' = y''$, $az' = z''$,
it becomes

$$\left[-\frac{\hbar^2}{2}\left(\frac{1}{m_1} + \frac{1}{m_2}\right)\nabla''^2 + V(x'', y'', z'')\right]\phi = E'\phi$$

which is identical with eq. (151).
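
The algebra behind this identification is short: rescaling the coordinates by $a$ multiplies every second derivative by $a^2$, and $(1/m_2 - 1/M)\,a^2$ must equal $1/\mu$. A minimal sketch in exact rational arithmetic, with hypothetical function names and an illustrative mass ratio, is:

```python
from fractions import Fraction


def reduced_mass(m1, m2):
    """Reduced mass m1*m2/(m1 + m2)."""
    return m1 * m2 / (m1 + m2)


def coefficient_from_148(m1, m2):
    """Coefficient of -(hbar^2/2) nabla''^2 obtained from (148) for two bodies."""
    total = m1 + m2
    scale = total / m1  # the factor a = (m1 + m2)/m1
    # (148) gives 1/m2 - 1/M in the primed coordinates; the substitution
    # x'' = a x' multiplies every second derivative by a**2.
    return (1 / m2 - 1 / total) * scale ** 2


# Illustrative masses only (roughly a proton-to-electron ratio).
M1, M2 = Fraction(1836), Fraction(1)

print(coefficient_from_148(M1, M2), 1 / reduced_mass(M1, M2))
```

The two printed fractions agree, which is the statement that (148) and (151) coincide for two bodies.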

11.32. Independent Systems. — Physical systems are independent, or 
isolated from one another, if the Hamiltonian operator of one contains no 
terms referring to another system. There is then no interaction between 
them. Consider n independent systems, and let the coordinates of the 
r-th system (including the spin coordinate) be symbolized by the single 
letter $q_r$. If its Hamiltonian operator is $H_r$, its Schrödinger equation will be

$$H_r\psi_i^{(r)}(q_r) = E_i^{(r)}\psi_i^{(r)}(q_r) \tag{11-152}$$

$E_i^{(r)}$ being the $i$-th eigenvalue of the $r$-th system.

The state function describing the entire assemblage of $n$ systems will
satisfy the equation

$$(H_1 + H_2 + \cdots + H_n)\,\Psi(q_1, q_2, \cdots q_n) = E\,\Psi(q_1, q_2, \cdots q_n) \tag{11-153}$$

To find its solutions we put $\Psi(q_1, q_2, \cdots q_n) = \psi^{(1)}(q_1)\,\psi^{(2)}(q_2)\cdots\psi^{(n)}(q_n)$
tentatively. Substitution in (153) and use of the fact that $H_1$ acts only on
$q_1$, etc., leads at once to the equation

$$\frac{H_1\psi^{(1)}}{\psi^{(1)}} + \frac{H_2\psi^{(2)}}{\psi^{(2)}} + \cdots + \frac{H_n\psi^{(n)}}{\psi^{(n)}} = E$$

which shows that each term $H_r\psi^{(r)}/\psi^{(r)}$ is separately a constant, say $E^{(r)}$,
and that the sum of all these constants is $E$. But if $H_r\psi^{(r)}/\psi^{(r)} = E^{(r)}$,
then $\psi^{(r)}$ must be one of the set of functions defined by (152), and $E^{(r)}$
one of the energies $E_i^{(r)}$. Therefore

$$\Psi(q_1, q_2, \cdots q_n) = \psi_i^{(1)}(q_1)\cdot\psi_j^{(2)}(q_2)\cdots\psi_s^{(n)}(q_n), \qquad E = E_i^{(1)} + E_j^{(2)} + \cdots + E_s^{(n)} \tag{11-154}$$


This result is indeed what intuition would lead us to expect. For
clearly the total energy of a number of isolated systems is the sum of the
individual energies. Furthermore, if $w_1$ is the probability that system 1 be
found at $q_1$, $w_2$ that system 2 be found at $q_2$, then the probability that both
of these statements be true simultaneously is the product $w_1 w_2$. Hence
the individual $\psi$-functions, whose squares are these probabilities, must
likewise combine as factors.
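
A small finite-dimensional analogue makes the additivity of energies and the product form of the state concrete. In the sketch below two independent "systems" are represented by 2×2 matrices, and the combined Hamiltonian is $H_1 \otimes 1 + 1 \otimes H_2$; the matrices, the unnormalized eigenvectors and all helper names are illustrative assumptions, not taken from the text.

```python
def kron_vectors(u, v):
    """Product function: components u[i]*v[k], ordered by the pair (i, k)."""
    return [a * b for a in u for b in v]


def kron_matrices(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
        for i in range(len(a))
        for k in range(len(b))
    ]


def identity(size):
    return [[1 if r == c else 0 for c in range(size)] for r in range(size)]


def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matvec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


# System 1: eigenvector (1, 1) with eigenvalue 1 (normalization omitted).
H1, PSI1, E1 = [[0, 1], [1, 0]], [1, 1], 1
# System 2: eigenvector (0, 1) with eigenvalue 5.
H2, PSI2, E2 = [[2, 0], [0, 5]], [0, 1], 5

# H1 acts only on the first index, H2 only on the second.
H_TOTAL = add(kron_matrices(H1, identity(2)), kron_matrices(identity(2), H2))
PSI = kron_vectors(PSI1, PSI2)

print(matvec(H_TOTAL, PSI), [(E1 + E2) * x for x in PSI])
```

The product vector is an eigenvector of the combined operator and its eigenvalue is the sum $E^{(1)} + E^{(2)}$, mirroring (154).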

This latter circumstance is dictated also by the time dependence of the
Schrödinger states, eq. (113). For only the product of the individual
functions $\psi^{(1)}e^{-(i/\hbar)E^{(1)}t}$, $\psi^{(2)}e^{-(i/\hbar)E^{(2)}t}$, etc., will have the factor $e^{-(i/\hbar)Et}$
required in $\Psi(q_1, q_2, \cdots q_n)\,e^{-(i/\hbar)Et}$.

11.33. The Exclusion Principle. — When two independent systems 
occupy the energy states $E_i^{(1)}$ and $E_j^{(2)}$ respectively, the combined system
has an energy

$$E = E_i^{(1)} + E_j^{(2)}$$

and a state function

$$\Psi = \psi_i^{(1)}(q_1)\cdot\psi_j^{(2)}(q_2) \tag{11-155}$$

We shall suppose for the moment that the individual states $\psi_i^{(1)}$ and $\psi_j^{(2)}$
are non-degenerate. Then, unless there happen to be two other energies $E_k^{(1)}$
and $E_l^{(2)}$ whose sum is precisely the same as $E_i^{(1)} + E_j^{(2)}$, the combined
state (155) will also be non-degenerate. This will generally be the case
when the two systems are different in a physical sense.

But if they are similar, e.g., both electrons, or both hydrogen atoms,
another situation arises. We may then drop all superscripts in the
description of the states, and write (155)

$$E = E_i + E_j, \qquad \Psi = \psi_i(q_1)\cdot\psi_j(q_2) \tag{11-156}$$

This state is degenerate, although $\psi_i$ and $\psi_j$ are not; for if we interchange
the indices $i$ and $j$, or what is the same, interchange the coordinates $q_1$
and $q_2$ in $\Psi$, there results a different $\Psi$-function but not a different energy.
This degeneracy, which is peculiar to the description of any aggregate of
similar systems, is known as exchange degeneracy. Classically it implies
that the energy of the total system is unaltered when two individual
constituents exchange places and spins.

In the more general case where $E_i$ has $g_i$ and $E_j$ has $g_j$ linearly
independent functions associated with it, the number of $\Psi$ corresponding to $E$
will be, not $g_i g_j$, but $2g_i g_j$.

Returning to the case of non-degeneracy of $\psi_i$ and $\psi_j$ we note that the
two functions

$$\Psi_I = \psi_i(q_1)\,\psi_j(q_2), \qquad \Psi_{II} = \psi_j(q_1)\,\psi_i(q_2),$$

which are linearly independent, are equally good representatives of the
state in which $E = E_i + E_j$. Moreover, any linear combination of the
two satisfies the Schrödinger equation for this value of $E$, and has just
claim to be considered. Of course, only two such combinations can be
linearly independent. Let us then consider the function

$$a\Psi_I + b\Psi_{II}$$

where we shall assume $|a|^2 + |b|^2 = 1$ to assure normalization. On
"exchanging" the two systems, $\Psi_I \to \Psi_{II}$ and $\Psi_{II} \to \Psi_I$, hence the
function above transforms itself into

$$b\Psi_I + a\Psi_{II}$$

the numerical value of which for any given configuration $(q_1, q_2)$ will in
general be different from $a\Psi_I + b\Psi_{II}$. Physically, this implies that the
configuration which results when the two systems exchange places has an
altogether different probability than the original, a consequence that is
clearly objectionable.

However, among all linear combinations there are two which avoid this
dilemma. They are the symmetric³² combination

$$\Psi_S(q_1, q_2) = \sqrt{\tfrac12}\,(\Psi_I + \Psi_{II})$$

and the "antisymmetric" one

$$\Psi_A(q_1, q_2) = \sqrt{\tfrac12}\,(\Psi_I - \Psi_{II})$$

They are independent and indeed orthogonal; the first remains unaltered
on exchange of systems, the second changes its sign. Both, therefore, yield
probabilities $|\Psi|^2$ which are insensitive to exchange.
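
The behaviour of the different combinations under exchange can be seen numerically. The sketch below uses two hypothetical, unnormalized one-dimensional functions in place of $\psi_i$ and $\psi_j$ (they are assumptions chosen only for illustration) and compares a general combination with the symmetric and antisymmetric ones at one configuration $(q_1, q_2)$.

```python
import math


def psi_i(q):
    """Hypothetical single-system function for state i (not normalized)."""
    return math.exp(-q * q)


def psi_j(q):
    """Hypothetical single-system function for state j (not normalized)."""
    return q * math.exp(-q * q)


def combination(a, b, q1, q2):
    """a*Psi_I + b*Psi_II, Psi_I = psi_i(q1)psi_j(q2), Psi_II = psi_j(q1)psi_i(q2)."""
    return a * psi_i(q1) * psi_j(q2) + b * psi_j(q1) * psi_i(q2)


ROOT_HALF = math.sqrt(0.5)


def psi_s(q1, q2):
    """Symmetric combination."""
    return combination(ROOT_HALF, ROOT_HALF, q1, q2)


def psi_a(q1, q2):
    """Antisymmetric combination."""
    return combination(ROOT_HALF, -ROOT_HALF, q1, q2)


Q1, Q2 = 0.3, 1.1

# A general combination with |a|^2 + |b|^2 = 1 (a = 0.6, b = 0.8).
general = combination(0.6, 0.8, Q1, Q2) ** 2
general_exchanged = combination(0.6, 0.8, Q2, Q1) ** 2

print("general:      ", general, general_exchanged)
print("symmetric:    ", psi_s(Q1, Q2) ** 2, psi_s(Q2, Q1) ** 2)
print("antisymmetric:", psi_a(Q1, Q2) ** 2, psi_a(Q2, Q1) ** 2)
```

Only the last two lines show equal squares before and after the exchange; the general combination assigns different probabilities to the two configurations.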

Consider now, not two, but $n$ independent similar systems, in states
$\psi_i, \psi_j, \cdots \psi_s$. The assemblage has the energy $E = E_i + E_j + \cdots + E_s$,
and is described by the state function

$$\Psi(q_1, q_2, \cdots q_n) = \psi_i(q_1)\,\psi_j(q_2)\cdots\psi_s(q_n) \tag{11-157}$$

But every permutation of the $q$'s among the $\psi$'s on the right will produce a
new function belonging to the same $E$, provided the subscripts $i, j, \cdots s$
are all different (which we shall assume for the moment). Hence, if $P$
represents any one of the $n!$ possible permutations of the $q$'s, and
$\Psi_P(q_1, q_2, \cdots q_n)$ the function which results from (157) when this
permutation is made, then

$$\Psi(q_1, q_2, \cdots q_n) = \sum_P a_P \Psi_P \tag{11-158}$$

where the $a_P$ are arbitrary constants, one for each permutation (arbitrary
except for the normalization condition), represents an acceptable state
function for the energy $E$. Since there were originally $n!$ linearly
independent functions, there will also be $n!$ linearly independent combinations
of the type (158).

Fortunately, most of these are uninteresting, for they cause

$$|\Psi(q_1, q_2, \cdots q_n)|^2$$

32 A function is said to be symmetric with respect to a given operation if the
operation leaves it unchanged; it is said (in quantum mechanics) to be antisymmetric if the
operation changes its sign without altering it in any other way.





to change when an exchange is made among any of the $q$'s. There are
certainly two combinations, however, which preserve probabilities on
exchange. One is the symmetrical, the other the antisymmetrical
combination. The symmetrical one is formed by making all the coefficients $a_P$
in (158) equal:

$$\Psi_S(q_1, q_2, \cdots q_n) = (n!)^{-1/2}\sum_P \Psi_P \tag{11-159}$$

the antisymmetric one by giving opposite signs to even and odd permutations
(cf. Chapter 15):

$$\Psi_A(q_1, q_2, \cdots q_n) = (n!)^{-1/2}\sum_P (-1)^P\,\Psi_P \tag{11-160}$$

A practical way of constructing (160) is to write the determinant

$$\Psi_A = (n!)^{-1/2}\begin{vmatrix} \psi_i(q_1) & \psi_i(q_2) & \psi_i(q_3) & \cdots & \psi_i(q_n) \\ \psi_j(q_1) & \psi_j(q_2) & \psi_j(q_3) & \cdots & \psi_j(q_n) \\ \vdots & & & & \vdots \\ \psi_s(q_1) & \psi_s(q_2) & \psi_s(q_3) & \cdots & \psi_s(q_n) \end{vmatrix} \tag{11-160'}$$

which the reader will easily recognize as equivalent to the expansion (160).

It is to these two functions, $\Psi_S$ and $\Psi_A$, that we must confine our
attention. Lest the simplicity of our formalism obscure significant details,
we recall that $q_r$ stands for all coordinates of the $r$-th system. Thus, if
the systems were electrons, $\psi_j(q_r)$ would be an abbreviation for a
combination of space and spin functions:

$$\psi_{j+}(x_r, y_r, z_r)\,\alpha(s_r) + \psi_{j-}(x_r, y_r, z_r)\,\beta(s_r)$$

in the notation of sec. 29, and an interchange of $q_r$ and $q_p$ means that $x_r$
is to be exchanged against $x_p$, $y_r$ against $y_p$, $z_r$ against $z_p$ and $s_r$ against $s_p$.

There is no a priori way of deciding which of the two functions, (159) 
or (160), is preferable. But here the exclusion principle, early recognized 
by Pauli, creates simplicity in a most effective way. It states that if the 
individual systems belong to a certain class (see below), only antisymmetric 
functions may be used in describing the assemblage. This principle is of the 
nature of a postulate; it has not yet been deduced from more fundamental 
axioms, although one might hope, from a mathematical point of view, that 
this will prove possible. 33 Why nature insists upon antisymmetric states 
for some and symmetric states for others among its creatures is at present 
a puzzle. 

The elementary systems to which Pauli's principle is known to apply 

33 A very searching and interesting examination of the principle in the light of other 
fundamental issues has been given by Pauli, Phys. Rev. 58, 716 (1940). 




are: electrons, positrons, protons, neutrons, neutrinos and mu-mesons; 
photons, on the other hand, and several kinds of meson, are described by 
symmetrical state functions. 

Perhaps the most important consequence of the exclusion principle is
this. Suppose our assemblage consists of electrons, two of which are
described by the same function $\psi_i$ (i.e., the functions are identical with
respect to positional and spin factors). The determinant (160') will then
have two equal rows, and hence will vanish. We may therefore say: two
systems obeying the Pauli principle cannot be in the same state. This fact
governs the structure of atoms and molecules; each electron added to the
shell of an atom must have its own set of quantum numbers.
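
The sum over permutations in (160), its sign change on exchange, and its vanishing for two identical single-system functions can all be tried out directly. The following sketch uses three hypothetical functions $1$, $q$, $q^2$ of a single coordinate (an assumption made purely for illustration; real $q_r$ would include spin), for which the determinant is a Vandermonde determinant with a known closed form.

```python
import itertools
import math


def permutation_sign(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def permutation_sum(functions, coords, antisymmetric):
    """(n!)^(-1/2) * sum over P of (+/-1) * f_1(q_P1) f_2(q_P2) ... f_n(q_Pn)."""
    n = len(functions)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        term = permutation_sign(perm) if antisymmetric else 1
        for k in range(n):
            term *= functions[k](coords[perm[k]])
        total += term
    return total / math.sqrt(math.factorial(n))


# Hypothetical single-system functions of one coordinate.
def f0(q):
    return 1.0


def f1(q):
    return q


def f2(q):
    return q * q


COORDS = (0.5, 1.5, 2.0)
EXCHANGED = (1.5, 0.5, 2.0)  # systems 1 and 2 interchanged

psi_a = permutation_sum((f0, f1, f2), COORDS, antisymmetric=True)
psi_a_exchanged = permutation_sum((f0, f1, f2), EXCHANGED, antisymmetric=True)
psi_a_repeated = permutation_sum((f1, f1, f2), COORDS, antisymmetric=True)
psi_s_repeated = permutation_sum((f1, f1, f2), COORDS, antisymmetric=False)

print(psi_a, psi_a_exchanged, psi_a_repeated, psi_s_repeated)
```

The antisymmetric sum changes sign under the exchange and vanishes when two of the functions coincide, whereas the symmetric sum with the same repeated function does not vanish.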

The exclusion principle makes it impossible to distinguish two states 
which differ only by an interchange of two constituent systems, a fact which 
has already been noted. 

Photons, which are described by the symmetrical function (159), may
exist in identical states, because that function does not vanish when two
sets of indices like $i$ and $j$, contained in $\Psi_S$, become equal.

11.34. Excited States of the Helium Atom. — To show how the Pauli 
principle is applied we treat some of the excited states of the helium atom. 
The latter is to be regarded as a simple assemblage of 2 electrons moving in 
the Coulomb field of the nucleus (and under their mutual repulsion), 
hence the considerations of the foregoing section apply. However, in the 
first part of our treatment we shall ignore both the electron spin and the 
exclusion principle. 

The Schrödinger equation has already been given (eq. 90); it is

$$\left(H_1 + H_2 + \frac{e^2}{r_{12}}\right)\Psi = E\Psi \tag{11-161}$$

where

$$H_i = -\frac{\hbar^2}{2m}\nabla_i^2 - \frac{2e^2}{r_i}$$

If the term $e^2/r_{12}$ were absent the two electrons would be independent, and
$\Psi$ would be a product of the form $\psi_i(q_1)\cdot\psi_j(q_2)$, $E$ being $E_i + E_j$. Moreover
$\psi_i$ and $\psi_j$ would be hydrogen eigenfunctions with atomic number
$Z = 2$, for $H_1$ and $H_2$ are Hamiltonian operators for a single electron in a
Coulomb field. To retain the notation of sec. 19 we shall now write $u$ for
the individual electron functions, so that, in the absence of the interaction
term,

$$\Psi = u_i(x_1 y_1 z_1)\,u_j(x_2 y_2 z_2) \tag{11-162}$$

Functions of this type will be used as variation functions with the
complete Hamiltonian (161). Let us first give thought to the proper choice of




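the individual functions $u$. The state corresponding to the lowest energy
of a single electron is (cf. eq. 67a)

$$u_{10} = \left(\frac{8}{\pi a_0^3}\right)^{1/2} e^{-2r/a_0} \tag{11-163}$$

We are writing here, in place of the single subscript $i$, the values of the two
quantum numbers $n = 1$ and $l = 0$. The first excited state is either

$$u_{20} = R_{20}Y_0(\theta,\varphi)$$

or

$$u_{21} = R_{21}Y_1(\theta,\varphi)$$

The spherical harmonic $Y_0$ is a constant, but $Y_1$ is any linear combination
of the three functions $P_1^1(\cos\theta)e^{i\varphi}$, $P_1^0(\cos\theta)$ and $P_1^1(\cos\theta)e^{-i\varphi}$. It will be
convenient to choose the following normalized combinations

$$Y_x = \sqrt{\frac{3}{16\pi}}\,[P_1^1(\cos\theta)e^{i\varphi} + P_1^1(\cos\theta)e^{-i\varphi}] = \sqrt{\frac{3}{4\pi}}\sin\theta\cos\varphi = \sqrt{\frac{3}{4\pi}}\,\frac{x}{r}$$

$$Y_y = -i\sqrt{\frac{3}{16\pi}}\,[P_1^1(\cos\theta)e^{i\varphi} - P_1^1(\cos\theta)e^{-i\varphi}] = \sqrt{\frac{3}{4\pi}}\sin\theta\sin\varphi = \sqrt{\frac{3}{4\pi}}\,\frac{y}{r}$$

$$Y_z = \sqrt{\frac{3}{4\pi}}\,P_1^0(\cos\theta) = \sqrt{\frac{3}{4\pi}}\cos\theta = \sqrt{\frac{3}{4\pi}}\,\frac{z}{r}$$

and to define³⁴

$$u_{20} = R_{20}Y_0, \qquad u_{2x} = R_{21}Y_x, \qquad u_{2y} = R_{21}Y_y, \qquad u_{2z} = R_{21}Y_z \tag{11-164}$$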

as the four independent, orthonormal functions describing the first excited
state of the one-electron system. The product (162) can be formed by
combining $u_{10}$ with any one of the four functions (164); furthermore, the
arguments can be interchanged in each of the functions thus constructed.
We are therefore concerned with the following eight functions, each of
which is a solution of eq. (161) with the term $e^2/r_{12}$ deleted, and belongs
to the energy

$$E_0 = -\frac{2e^2}{a_0}\left(1 + \tfrac14\right) = 5E_H \tag{11-165}$$

34 $R_{nl}$ is given in eq. (65); its explicit form will not be needed here.




$$\begin{aligned} \psi_1 &= u_{10}(1)\,u_{20}(2), & \psi_2 &= u_{20}(1)\,u_{10}(2) \\ \psi_3 &= u_{10}(1)\,u_{2x}(2), & \psi_4 &= u_{2x}(1)\,u_{10}(2) \\ \psi_5 &= u_{10}(1)\,u_{2y}(2), & \psi_6 &= u_{2y}(1)\,u_{10}(2) \\ \psi_7 &= u_{10}(1)\,u_{2z}(2), & \psi_8 &= u_{2z}(1)\,u_{10}(2) \end{aligned} \tag{11-166}$$

In writing them we have indicated the arguments $(x_1 y_1 z_1)$ and $(x_2 y_2 z_2)$
simply by (1) and (2). A combination of these functions

$$\psi = \sum_{\lambda=1}^{8} a_\lambda\psi_\lambda$$

will be used as a variation function in the sense of sec. 20. The best
energies of the system are given by (97), and this reduces at once to the form
(104) because the $\psi_\lambda$ are orthonormal and belong to the operator
$H^0 = H_1 + H_2$. The perturbing term is

$$H' = \frac{e^2}{r_{12}}$$

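The next step in the solution of our problem is the calculation of the
matrix elements

$$H'_{ij} = \int \psi_i\,\frac{e^2}{r_{12}}\,\psi_j\,dx_1\,dy_1\,dz_1\,dx_2\,dy_2\,dz_2$$

using the functions (166), the details of which may be left for the reader.³⁵
Symmetry arguments may be used to show that

$$H'_{11} = H'_{22}, \qquad H'_{33} = H'_{44}, \qquad H'_{55} = H'_{66}, \qquad H'_{77} = H'_{88}$$

and that only functions in the same line of (166) give non-vanishing
elements. Furthermore the volume element adopted in the evaluation of $I$
(sec. 19) is convenient in proving:

$$H'_{33} = H'_{55} = H'_{77}; \qquad H'_{34} = H'_{56} = H'_{78}$$

Since the $\psi_\lambda$ are real, $H'_{ij} = H'_{ji}$. We are left, therefore, only with the
following matrix elements:

$$\begin{aligned} H'_{11} &= \int u_{10}^2(1)\,u_{20}^2(2)\,\frac{e^2}{r_{12}}\,d\tau \equiv J \\ H'_{12} &= \int u_{10}(1)\,u_{20}(2)\,\frac{e^2}{r_{12}}\,u_{20}(1)\,u_{10}(2)\,d\tau \equiv K \\ H'_{33} &= \int u_{10}^2(1)\,u_{2x}^2(2)\,\frac{e^2}{r_{12}}\,d\tau \equiv J' \\ H'_{34} &= \int u_{10}(1)\,u_{2x}(2)\,\frac{e^2}{r_{12}}\,u_{2x}(1)\,u_{10}(2)\,d\tau \equiv K' \end{aligned}$$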

35 See Heisenberg, W., Z. Phys. 39, 499 (1926).





In a sense previously defined (see sec. 11.21), $J$ and $J'$ are Coulomb
integrals, $K$ and $K'$ exchange integrals.

The determinantal eq. (97) becomes

$$\begin{vmatrix} J-\epsilon & K & & & & & & \\ K & J-\epsilon & & & & & & \\ & & J'-\epsilon & K' & & & & \\ & & K' & J'-\epsilon & & & & \\ & & & & J'-\epsilon & K' & & \\ & & & & K' & J'-\epsilon & & \\ & & & & & & J'-\epsilon & K' \\ & & & & & & K' & J'-\epsilon \end{vmatrix} = 0 \tag{11-167}$$

provided we write $\epsilon$ for $E - E_0$. All elements not written are zeros. The
determinant has two single roots: $\epsilon_1 = J - K$, $\epsilon_2 = J + K$, and two
triple roots: $\epsilon_3 = J' - K'$, $\epsilon_4 = J' + K'$. The perturbation $e^2/r_{12}$ may
therefore be said to change the one unperturbed level $E_0$ into four
perturbed levels: $E_0 + \epsilon_1$, $E_0 + \epsilon_2$, $E_0 + \epsilon_3$, $E_0 + \epsilon_4$, as indicated
qualitatively in the diagram (Fig. 6).

[Fig. 11-6]

To find the functions corresponding to the eight roots $\epsilon$ we must return
to equations (96):

$$\begin{aligned} a_1(J - \epsilon) + a_2 K &= 0 \\ a_1 K + a_2(J - \epsilon) &= 0 \\ a_3(J' - \epsilon) + a_4 K' &= 0 \\ a_3 K' + a_4(J' - \epsilon) &= 0 \quad \text{etc.} \end{aligned}$$

On substituting $\epsilon_1$ for $\epsilon$ we find $a_2 = -a_1$, $a_3 = a_4 = \cdots = a_8 = 0$. On
substituting $\epsilon = \epsilon_2$, we find $a_2 = a_1$, $a_3 = a_4 = \cdots = a_8 = 0$, and so forth.
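
Because (167) is block diagonal, each 2×2 block can be treated on its own. The sketch below finds the roots and the normalized coefficients for one block and evaluates the full determinant as a product of the four blocks. The numerical values of $J$, $K$, $J'$, $K'$ are hypothetical placeholders in arbitrary units, not computed integrals, and the function names are illustrative.

```python
import math


def block_roots(j, k):
    """Roots of (J - eps)^2 - K^2 = 0 for a single 2x2 block."""
    return (j - k, j + k)


def block_coefficients(j, k, eps):
    """Normalized (a1, a2) solving a1*(J - eps) + a2*K = 0."""
    ratio = -(j - eps) / k  # a2 / a1 from the first equation
    norm = math.sqrt(1 + ratio * ratio)
    return (1 / norm, ratio / norm)


def secular_determinant(j, k, j_prime, k_prime, eps):
    """Value of the determinant (167): one (J, K) block, three (J', K') blocks."""
    first = (j - eps) ** 2 - k ** 2
    other = (j_prime - eps) ** 2 - k_prime ** 2
    return first * other ** 3


# Hypothetical values in arbitrary units, NOT evaluated integrals.
J, K, J_PRIME, K_PRIME = 0.5, 0.125, 0.75, 0.0625

eps_1, eps_2 = block_roots(J, K)
eps_3, eps_4 = block_roots(J_PRIME, K_PRIME)

print("roots:", eps_1, eps_2, eps_3, eps_4)
print("eps_1 coefficients:", block_coefficients(J, K, eps_1))
print("eps_2 coefficients:", block_coefficients(J, K, eps_2))
```

For the root $J - K$ the two coefficients are equal and opposite, for $J + K$ they are equal, which is the origin of the factors $\sqrt{1/2}$ in the combinations $\psi_1 \mp \psi_2$.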





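We thus obtain the set of energies and normalized variation functions given
in the first two columns of Table 1.

TABLE 1

E                      ψ                                        Symmetry   Multiplicity
$E_0 + J - K$          $\sqrt{\tfrac12}(\psi_1 - \psi_2)$        a          Triplet
$E_0 + J + K$          $\sqrt{\tfrac12}(\psi_1 + \psi_2)$        s          Singlet
$E_0 + J' - K'$        $\sqrt{\tfrac12}(\psi_3 - \psi_4)$        a          Triplet
$E_0 + J' - K'$        $\sqrt{\tfrac12}(\psi_5 - \psi_6)$        a          Triplet
$E_0 + J' - K'$        $\sqrt{\tfrac12}(\psi_7 - \psi_8)$        a          Triplet
$E_0 + J' + K'$        $\sqrt{\tfrac12}(\psi_3 + \psi_4)$        s          Singlet
$E_0 + J' + K'$        $\sqrt{\tfrac12}(\psi_5 + \psi_6)$        s          Singlet
$E_0 + J' + K'$        $\sqrt{\tfrac12}(\psi_7 + \psi_8)$        s          Singlet

It now becomes necessary to include the spin into our analysis. To do
this accurately would require a modification of the Hamiltonian operator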
(161), for the magnetic moments of the spinning electrons produce an 
interaction with the magnetic field due to their orbital motions and this 
interaction has not been included in (161). We shall omit this spin-orbit 
interaction and refer the reader to the literature for the more accurate 
treatment.³⁶ In other words, we shall suppose that the Hamiltonian does
not act on the spin coordinates. The state function is then separable and
appears as the product of an orbital (any of the functions in the table)
and a spin function, and the latter may be taken as an eigenfunction of $\sigma_z$
for each electron. Let us consider these spin functions more closely.
For the two electrons, we have four functions:

$$\alpha(s_1)\alpha(s_2), \quad \alpha(s_1)\beta(s_2), \quad \beta(s_1)\alpha(s_2), \quad \text{and} \quad \beta(s_1)\beta(s_2)$$

These, however, do not have convenient exchange properties, for when $s_1$
and $s_2$ are interchanged, the first and last remain unaltered, the second
transforms into the third and the third into the second. But it is possible
to construct from the second and third two other, equivalent functions,
which are symmetrical and antisymmetrical with respect to an exchange of
spin coordinates. They are, when normalized, $\sqrt{\tfrac12}\,[\alpha(s_1)\beta(s_2) + \beta(s_1)\alpha(s_2)]$
and $\sqrt{\tfrac12}\,[\alpha(s_1)\beta(s_2) - \beta(s_1)\alpha(s_2)]$. We have in this way obtained four
spin functions

$$\begin{aligned} \Sigma_1 &= \alpha(s_1)\alpha(s_2), \qquad \Sigma_2 = \sqrt{\tfrac12}\,[\alpha(s_1)\beta(s_2) + \beta(s_1)\alpha(s_2)], \qquad \Sigma_3 = \beta(s_1)\beta(s_2), \\ A &= \sqrt{\tfrac12}\,[\alpha(s_1)\beta(s_2) - \beta(s_1)\alpha(s_2)] \end{aligned} \tag{11-168}$$

36 Condon, E. U., and Shortley, G. H., "The Theory of Atomic Spectra," Macmillan
Co., New York, 1935.





the first three of which are symmetrical, only the last being antisymmetri- 
cal. Furthermore, this set of functions is orthogonal (and complete). 
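
These statements are easy to confirm by representing each two-electron spin function by its four coefficients in the basis $\alpha\alpha$, $\alpha\beta$, $\beta\alpha$, $\beta\beta$; exchanging $s_1$ and $s_2$ then simply swaps the second and third coefficients. The following sketch assumes that representation; the names `SPIN_FUNCTIONS`, `exchange` and `inner` are illustrative.

```python
import math

ROOT_HALF = math.sqrt(0.5)

# Coefficients in the basis (alpha alpha, alpha beta, beta alpha, beta beta).
SPIN_FUNCTIONS = {
    "Sigma1": (1.0, 0.0, 0.0, 0.0),
    "Sigma2": (0.0, ROOT_HALF, ROOT_HALF, 0.0),
    "Sigma3": (0.0, 0.0, 0.0, 1.0),
    "A": (0.0, ROOT_HALF, -ROOT_HALF, 0.0),
}


def exchange(vector):
    """Interchange s1 and s2: alpha(s1)beta(s2) <-> beta(s1)alpha(s2)."""
    return (vector[0], vector[2], vector[1], vector[3])


def inner(u, v):
    """Scalar product of two real coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


for name, vector in SPIN_FUNCTIONS.items():
    swapped = exchange(vector)
    if swapped == vector:
        kind = "symmetric"
    elif swapped == tuple(-x for x in vector):
        kind = "antisymmetric"
    else:
        kind = "neither"
    print(name, kind)
```

The loop labels the first three functions symmetric and `A` antisymmetric; the scalar products between different functions vanish and each function has unit norm.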

To include the spin we need only multiply each one of the functions in
Table 1 by one of the spin functions $\Sigma_1$ to $A$, a procedure which yields 32
different functions of position and spin coordinates. But here the exclusion
principle effects a great simplification. It says that only functions which
are antisymmetrical when all coordinates, i.e., position and spin
coordinates, of the two electrons are interchanged, are to be permitted. Hence a
function of Table 1 which is symmetrical can only be combined with $A$,
and a function which is antisymmetrical only with $\Sigma_1$, $\Sigma_2$ and $\Sigma_3$.

Now the functions marked a in the table are antisymmetric; they can
be multiplied by any one of the three $\Sigma$-functions. Each of them
corresponds, therefore, to three states. For this reason the energy states
$E_0 + J' - K'$ and $E_0 + J - K$ are said to be triplet states. If spin-orbit
interaction had been included in our calculation each of these levels
would have appeared as three closely adjacent levels, while the other
energies, marked singlets, would have remained single.

It is true that the functions in Table 1 are only approximate solutions 
of eq. (161). Nevertheless what we have said about their symmetry 
with respect to exchange of electrons may be shown to hold rigorously. 
The structure of the helium energy spectrum, and in particular the singlet- 
triplet character of the states, are therefore correctly given by the simple 
theory of this section; the numerical values of the energy levels will be in 
error. 

The normal state of the helium atom, whose energy was computed
approximately in sec. 19 of this chapter, is given in the present notation by
$u_{10}(1)\,u_{10}(2)$, if we neglect the spin. It is clearly symmetrical and can
only be multiplied by $A$ when the spins are introduced. Hence it is a
singlet state. When the helium atom is in a singlet state, its probability 
of passing into a triplet state under emission or absorption of radiation is 
very small, as may be shown by an extension of the methods used in 
sec. 11.28. Hence triplet and singlet levels do not “combine,” and helium
may be said to have two distinct spectra, the triplet spectrum to which
spectroscopists apply the term “orthohelium” spectrum, and the singlet
spectrum called “parhelium” spectrum.

Problem a. Instead of using the 8 functions (166) as linear variation functions,
start with the 32 functions obtained from (166) by multiplying each of them by $\Sigma_1$, $\Sigma_2$,
$\Sigma_3$, $A$. Show that, if these 32 functions are suitably arranged, the determinantal
equation is a four-fold repetition of the one obtained above, and that it yields the same
results in regard to both energies and functions.

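Problem b. The following spin operators for two electrons may be defined:

$$\sigma_z = \sigma_{z1} + \sigma_{z2}$$

$$\sigma^2 = (\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2)^2 = \sigma_{x1}^2 + \sigma_{y1}^2 + \sigma_{z1}^2 + \sigma_{x2}^2 + \sigma_{y2}^2 + \sigma_{z2}^2 + 2(\sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2})$$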





where $\sigma_{x1}$ is the operator $\sigma_x$ acting on spin coordinate $s_1$, etc. Show that $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ and
$A$ are all eigenstates with respect to both of these operators, in particular that

$$\sigma_z\Sigma_1 = 2\Sigma_1, \quad \sigma_z\Sigma_2 = 0, \quad \sigma_z\Sigma_3 = -2\Sigma_3, \quad \sigma_z A = 0$$

$$\sigma^2\Sigma_1 = 8\Sigma_1, \quad \sigma^2\Sigma_2 = 8\Sigma_2, \quad \sigma^2\Sigma_3 = 8\Sigma_3, \quad \sigma^2 A = 0$$

Are these results consistent with the classical interpretation according to which

$\Sigma_1$ is the state in which both spins are parallel and along $Z$,

$\Sigma_2$ is the state in which both spins are parallel and perpendicular to $Z$,

$\Sigma_3$ is the state in which both spins are parallel and along $-Z$,

$A$ is the state in which both spins are opposed and yield no resultant angular momentum?
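
The eigenvalue statements of Problem b can be checked numerically, which is of course no substitute for the proof. The sketch below assumes the usual Pauli matrices and the customary forms $\Sigma_1 = \alpha(1)\alpha(2)$, $\Sigma_2 = [\alpha(1)\beta(2) + \beta(1)\alpha(2)]/\sqrt2$, $\Sigma_3 = \beta(1)\beta(2)$, $A = [\alpha(1)\beta(2) - \beta(1)\alpha(2)]/\sqrt2$. These forms, the basis ordering and every function name are assumptions of the illustration, since eq. (168) is not reproduced in this section.

```python
from math import sqrt

# Pauli matrices and the 2x2 unit matrix (assumed standard representation).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

def total(s):
    """Operator s acting on electron 1 plus the same operator on electron 2."""
    return add(kron(s, I2), kron(I2, s))

SIGMA_Z = total(SZ)
SIGMA_SQ = [[0] * 4 for _ in range(4)]
for s in (SX, SY, SZ):
    t = total(s)
    SIGMA_SQ = add(SIGMA_SQ, matmul(t, t))

# Two-spin basis ordered as (aa, ab, ba, bb); a = spin up, b = spin down.
r = 1 / sqrt(2)
STATES = {
    "Sigma1": [1, 0, 0, 0],
    "Sigma2": [0, r, r, 0],
    "Sigma3": [0, 0, 0, 1],
    "A": [0, r, -r, 0],
}

def eigenvalue(op, v, tol=1e-12):
    """Return lam if op v = lam v; raise AssertionError otherwise."""
    w = apply(op, v)
    k = max(range(len(v)), key=lambda i: abs(v[i]))
    lam = w[k] / v[k]
    assert all(abs(w[i] - lam * v[i]) < tol for i in range(len(v)))
    return complex(lam).real

for name, vec in STATES.items():
    print(name, eigenvalue(SIGMA_Z, vec), eigenvalue(SIGMA_SQ, vec))
```

Each state passes the eigenvector test inside `eigenvalue`, so a failed assertion would signal a state that is not an eigenstate of the operator in question.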

11.36. The Hydrogen Molecule. — One of the stumbling blocks of pre-quantum
chemistry was the phenomenon of homopolar binding; it is
impossible to explain on the basis of classical dynamics the union of two
hydrogen atoms to form a molecule. The only attraction which two
neutral structures like H-atoms could possibly exhibit was due to quadrupole
forces, and these were known to be too weak to account for molecular
binding. It was shown by Heitler and London that the homopolar bond
is caused by a typical quantum-mechanical effect: the “exchange” of
the two electrons. Its meaning will be clear from the following discussion.

The method of calculation$^{37}$ to be employed is a simple one which lays
little claim to quantitative accuracy$^{38}$ but exposes the significant facts in a
beautiful way. It is similar to the treatment of the H$_2^+$-ion, from which it
differs by the presence of two electrons instead of one. The coordinate
system to be used will be clear from Fig. 7; particles 1 and 2 are electrons,
A and B are the protons whose positions are regarded as fixed. In connection
with Fig. 7, we also wish to outline the use of a coordinate system and
a volume element which are very convenient in the numerical work involved
in this problem.

The coordinate system for the two electrons will contain the six variables

$$A_1,\; B_1,\; B_2,\; r_{12},\; \varphi_1,\; \varphi_2$$

with the ranges

$$|B_2 - B_1| \le r_{12} \le B_1 + B_2, \qquad 0 \le B_2 < \infty$$

$$|B_1 - R| \le A_1 \le B_1 + R, \qquad 0 \le B_1 < \infty$$

37 Heitler, W., and London, F., Z. Phys. 44, 455 (1927).

38 The most elaborate and accurate calculation, also employing the variational
method, was made by James, H. M., and Coolidge, A. S., J. Chem. Phys. 1, 825 (1933).





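The volume element is $d\tau = d\tau_1\,d\tau_2$, where

$$d\tau_1 = A_1^2\,dA_1 \sin\theta_1\,d\theta_1\,d\varphi_1$$

Now

$$B_1^2 = A_1^2 + R^2 - 2A_1R\cos\theta_1$$

whence

$$2B_1\,dB_1 = 2A_1R\sin\theta_1\,d\theta_1$$

On eliminating $\sin\theta_1\,d\theta_1$ from $d\tau_1$ by means of this last relation, we find

$$d\tau_1 = \frac{1}{R}\,A_1\,dA_1\,B_1\,dB_1\,d\varphi_1$$

The element $d\tau_2$ is obtained by writing down an expression similar to $d\tau_1$,
but using $B_1$ as base line:

$$d\tau_2 = \frac{1}{B_1}\,r_{12}\,dr_{12}\,B_2\,dB_2\,d\varphi_2$$

Hence the product $d\tau_1\,d\tau_2$ is

$$d\tau = \frac{1}{R}\,A_1\,dA_1\,B_2\,dB_2\,r_{12}\,dr_{12}\,dB_1\,d\varphi_1\,d\varphi_2 \tag{11-169}$$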


Several similar volume elements can be constructed by the same method. 
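
As an illustration of how such a volume element is used, the one-electron element $d\tau_1 = (1/R)A_1\,dA_1\,B_1\,dB_1\,d\varphi_1$ may be applied to the overlap integral $\int u_A(1)u_B(1)\,d\tau_1$ of two hydrogen 1s functions. The sketch below works in units with $a_0 = 1$, integrates $\varphi_1$ analytically (a factor $2\pi$), and uses a plain midpoint rule for $A_1$ and $B_1$. The function names, the grid sizes and the cutoff `b_max` are arbitrary choices made for this example, and the closed form used for comparison is the familiar $e^{-\rho}(1 + \rho + \rho^2/3)$.

```python
from math import exp

def overlap_numeric(R, b_max=40.0, n_outer=2000, n_inner=200):
    """Midpoint estimate of the overlap of two 1s functions (a0 = 1).

    Uses d(tau) = (1/R) A dA B dB d(phi), with |B - R| <= A <= B + R
    and 0 <= B < infinity (truncated at b_max).
    """
    h_b = b_max / n_outer
    total = 0.0
    for i in range(n_outer):
        b = (i + 0.5) * h_b
        lo, hi = abs(b - R), b + R
        h_a = (hi - lo) / n_inner
        inner = 0.0
        for j in range(n_inner):
            a = lo + (j + 0.5) * h_a
            inner += a * exp(-a)
        total += b * exp(-b) * inner * h_a * h_b
    # u_A u_B = exp(-(A + B)) / pi, and the phi integration contributes 2*pi.
    return (2.0 / R) * total

def overlap_closed_form(R):
    """exp(-rho) (1 + rho + rho^2 / 3) with rho = R / a0 and a0 = 1."""
    return exp(-R) * (1.0 + R + R * R / 3.0)

for R in (0.5, 1.4, 3.0):
    print(R, overlap_numeric(R), overlap_closed_form(R))
```

The two columns should agree to within the accuracy of the midpoint rule; the same limits of integration are the ones needed for the electron-repulsion integrals discussed later in this section.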

After this excursion, let us consider the Schrödinger equation of the
H$_2$-problem. It is

$$H\psi \equiv \left\{ -\frac{\hbar^2}{2m}\left(\nabla_1^2 + \nabla_2^2\right) - e^2\left(\frac{1}{A_1} + \frac{1}{B_2} + \frac{1}{A_2} + \frac{1}{B_1} - \frac{1}{r_{12}} - \frac{1}{R}\right)\right\}\psi = E\psi \tag{11-170}$$

We endeavor to solve it by the method of linear variation functions, choosing
as constituents of the trial function simple but reasonable approximations
to the correct $\psi$. If $H$ did not contain the last four items in the
parenthesis multiplying $e^2$ it would simply be the sum of two hydrogen-atom
Hamiltonians, and

$$\psi = u_A(1)\,u_B(2)$$

where

$$u_A(1) = (\pi a_0^3)^{-1/2}e^{-A_1/a_0}, \qquad u_B(2) = (\pi a_0^3)^{-1/2}e^{-B_2/a_0}$$

are hydrogen functions centered about A and B respectively. On the
other hand, if the terms $1/A_1 + 1/B_2 - 1/r_{12} - 1/R$ were missing from
the parenthesis, $H$ would also be the sum of two hydrogen-atom Hamiltonians,
but $\psi = u_B(1)\,u_A(2)$. Both of these $\psi$'s are equally good approximations,
and both must be included in the trial function. Note that they
differ with respect to an exchange of the electrons (or, what amounts in this




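problem to the same thing, the protons). Hence we adopt

$$\psi = c_1u_A(1)u_B(2) + c_2u_B(1)u_A(2) \tag{11-170a}$$

as variation function in minimizing

$$\frac{\int\psi^*H\psi\,d\tau}{\int\psi^*\psi\,d\tau}$$

As explained in sec. 20, the process leads to the secular equations

$$\begin{aligned} c_1(\mathcal{H}_{11} - \Delta_{11}E) + c_2(\mathcal{H}_{12} - \Delta_{12}E) &= 0 \\ c_1(\mathcal{H}_{21} - \Delta_{21}E) + c_2(\mathcal{H}_{22} - \Delta_{22}E) &= 0 \end{aligned} \tag{11-171}$$

and $E$ is given by

$$\begin{vmatrix} \mathcal{H}_{11} - \Delta_{11}E & \mathcal{H}_{12} - \Delta_{12}E \\ \mathcal{H}_{21} - \Delta_{21}E & \mathcal{H}_{22} - \Delta_{22}E \end{vmatrix} = 0 \tag{11-172}$$

Here

$$\Delta_{11} = \int u_A^2(1)\,u_B^2(2)\,d\tau_1\,d\tau_2 = \Delta_{22} = 1$$

$$\Delta_{12} = \int u_A(1)u_B(2)u_B(1)u_A(2)\,d\tau_1\,d\tau_2 = \left(\int u_A(1)u_B(1)\,d\tau_1\right)^2 = \Delta_{21}$$

The latter integral is familiar from sec. 21; it is the quantity there called
$\Delta_{AB}$. Hence

$$\Delta_{12} = \Delta_{21} = e^{-2\rho}\left(1 + \rho + \frac{\rho^2}{3}\right)^2, \qquad \rho = \frac{R}{a_0}$$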

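Next, we turn to

$$\mathcal{H}_{11} = \int u_A(1)u_B(2)\,H\,u_A(1)u_B(2)\,d\tau_1\,d\tau_2$$

The $\nabla^2$-terms in $H$ need not be calculated; their effect upon $u_A(1)$ and
$u_B(2)$ is at once obtainable from the differential equations which these
functions satisfy:

$$-\frac{\hbar^2}{2m}\nabla_2^2\,u_B(2) = \left(E_H + \frac{e^2}{B_2}\right)u_B(2)$$

In this way we find

$$\mathcal{H}_{11} = 2E_H + \frac{e^2}{R} + 2J + J'$$

where

$$J = -e^2\int u_A^2(1)\,u_B^2(2)\,B_1^{-1}\,d\tau_1\,d\tau_2 = -e^2\int u_A^2(1)\,B_1^{-1}\,d\tau_1 \tag{11-173}$$

and

$$J' = e^2\int\frac{u_A^2(1)\,u_B^2(2)}{r_{12}}\,d\tau_1\,d\tau_2 \tag{11-174}$$

$J$ is given in sec. 21, eq. (100), and $J'$ has the value

$$J' = \frac{e^2}{a_0\rho}\left[1 - e^{-2\rho}\left(1 + \frac{11}{8}\rho + \frac{3}{4}\rho^2 + \frac{1}{6}\rho^3\right)\right]$$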

Problem. Prove this result, using the system of coordinates and the volume element
(169).

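Furthermore,

$$\mathcal{H}_{22} = \mathcal{H}_{11}$$

as the reader will easily verify. In a similar way,

$$\mathcal{H}_{12} = \mathcal{H}_{21} = 2E_H\Delta_{12} + 2K\Delta_{12}^{1/2} + K' + \frac{e^2}{R}\Delta_{12}$$

where

$$K = -e^2\int u_A(1)\,u_B(1)\,B_1^{-1}\,d\tau_1 \tag{11-175}$$

and

$$K' = e^2\int\frac{u_A(1)\,u_B(1)\,u_A(2)\,u_B(2)}{r_{12}}\,d\tau_1\,d\tau_2 \tag{11-176}$$

The value of $K$ is given in eq. (100), and

$$K' = \frac{e^2}{5a_0}\left\{-e^{-2\rho}\left(-\frac{25}{8} + \frac{23}{4}\rho + 3\rho^2 + \frac{1}{3}\rho^3\right) + \frac{6}{\rho}\left[\Delta(\gamma + \ln\rho) - 2\sqrt{\Delta\Delta'}\,\mathrm{Ei}(-2\rho) + \Delta'\,\mathrm{Ei}(-4\rho)\right]\right\}$$

where $\gamma = 0.5772$ (Euler-Mascheroni constant),

$$\Delta = \Delta_{12}, \qquad \Delta' = e^{2\rho}\left(1 - \rho + \frac{\rho^2}{3}\right)^2$$

and $\mathrm{Ei}(x)$ is an abbreviation for the exponential integral

$$\mathrm{Ei}(x) = \int_{-\infty}^{x}\frac{e^u}{u}\,du,$$

which is tabulated and discussed, for instance, in “Tables of Sine, Cosine
and Exponential Integrals,” Federal Works Agency, New York, 1940.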





Problem. Evaluate $K'$. See in this connection, Sugiura, Y., Z. f. Phys. 46, 484
(1927).

The two roots of (172) are

$$E_1 = \frac{\mathcal{H}_{11} + \mathcal{H}_{12}}{1 + \Delta} = 2E_H + \frac{e^2}{R} + \frac{2J + J' + 2K\Delta^{1/2} + K'}{1 + \Delta}$$

$$E_2 = \frac{\mathcal{H}_{11} - \mathcal{H}_{12}}{1 - \Delta} = 2E_H + \frac{e^2}{R} + \frac{2J + J' - 2K\Delta^{1/2} - K'}{1 - \Delta} \tag{11-177}$$

Substitution into (171) shows that to $E_1$ there corresponds the function

$$\psi_1 = [2(1 + \Delta)]^{-1/2}\,[u_A(1)u_B(2) + u_B(1)u_A(2)] \tag{11-178}$$

and to $E_2$ the function

$$\psi_2 = [2(1 - \Delta)]^{-1/2}\,[u_A(1)u_B(2) - u_B(1)u_A(2)] \tag{11-179}$$


The energies $E_1$ and $E_2$ are plotted against $R$, the internuclear distance, in
Pauling and Wilson.$^{39}$ It will be seen that $E_1$ has a minimum in the
neighborhood of the experimental internuclear distance of the H$_2$-molecule;
at this minimum $E_1$ is negative and equal in order of magnitude to the
experimentally known minimum which causes the stability of the molecule.
On the other hand, $E_2$ is positive for all $R$, decreasing in monotone fashion
with increasing $R$. It, therefore, corresponds to repulsion between the
atoms. Comparison of $E_1$ and $E_2$ shows the difference in their behavior as
functions of $R$ to be predominantly due to the presence of the $K$ and $K'$
integrals. These would have been missing if electron exchange had not
been taken account of by introducing the two functions constituting the
$\psi$ of eq. (170a). In that case also, there would have been only one energy and
not two. Now while (170a) may be a crude approximation, the fact that
two equivalent functions, differing only with respect to electron exchange,
will compose the correct solution of (170) is beyond doubt, hence the qualitative
aspects here obtained cannot be questioned. The integrals $K$ and
$K'$ are called exchange integrals.
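
The behavior of the two roots can be reproduced with a short numerical sketch. It works in units of $e^2/a_0$ and measures both energies from $2E_H$, the energy of two separated atoms, which is the convention assumed here for the signs quoted above. The closed forms $J = -1/\rho + e^{-2\rho}(1 + 1/\rho)$ and $K = -e^{-\rho}(1 + \rho)$ are the standard ones and are assumed to agree with eq. (100), which is not reproduced in this section. The function names, the series used for the exponential integral and the grid of $\rho$ values are choices of the illustration only.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329

def ei_neg(x):
    """Ei(-x) for x > 0 from the series gamma + ln x + sum (-x)^k / (k k!)."""
    term, total = 1.0, 0.0
    for k in range(1, 200):
        term *= -x / k          # term is now (-x)^k / k!
        total += term / k
    return EULER_GAMMA + log(x) + total

def heitler_london(rho):
    """Return (E1, E2) in units of e^2/a0, measured from 2 E_H."""
    s = exp(-rho) * (1 + rho + rho**2 / 3)      # overlap; Delta = s^2
    delta = s * s
    sp = exp(rho) * (1 - rho + rho**2 / 3)      # Delta' = sp^2
    j = -1 / rho + exp(-2 * rho) * (1 + 1 / rho)
    k = -exp(-rho) * (1 + rho)
    jp = (1 - exp(-2 * rho)
          * (1 + 11 / 8 * rho + 3 / 4 * rho**2 + rho**3 / 6)) / rho
    kp = (-exp(-2 * rho) * (-25 / 8 + 23 / 4 * rho + 3 * rho**2 + rho**3 / 3)
          + 6 / rho * (delta * (EULER_GAMMA + log(rho))
                       - 2 * s * sp * ei_neg(2 * rho)
                       + sp * sp * ei_neg(4 * rho))) / 5
    e1 = 1 / rho + (2 * j + jp + 2 * k * s + kp) / (1 + delta)
    e2 = 1 / rho + (2 * j + jp - 2 * k * s - kp) / (1 - delta)
    return e1, e2

def scan(lo=0.8, hi=4.0, steps=320):
    """Tabulate (rho, E1, E2) on an evenly spaced grid."""
    h = (hi - lo) / steps
    return [(lo + i * h,) + heitler_london(lo + i * h) for i in range(steps + 1)]

table = scan()
rho_min, e1_min, _ = min(table, key=lambda row: row[1])
print("minimum of E1 near rho =", round(rho_min, 2), "depth", round(e1_min, 4))
print("E2 repulsive on the grid:", all(row[2] > 0 for row in table))
```

The series for $\mathrm{Ei}$ loses accuracy for large arguments, so the grid is deliberately restricted to moderate $\rho$; a tabulated or library value would be preferable for a wider range.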

Let us now include the spin and apply the Pauli principle. The spin
functions are those already encountered in the helium problem, eq. (168).
If the resultant function is to be antisymmetrical, $\psi_1$, which is symmetrical
in the position coordinates of electrons 1 and 2, must be multiplied by an
antisymmetrical function of the spins, of which there is only one, namely $A$.
However, $\psi_2$ may be multiplied by one of the three functions $\Sigma_1$, $\Sigma_2$ or $\Sigma_3$.
It represents a triplet state while $\psi_1$ is a singlet.

To the energy $E_2$, therefore, there correspond three times as many
quantum mechanical states as to $E_1$. From this fact may be drawn the
conclusion that when two H-atoms approach they will, ceteris paribus, be
three times as likely to repel as to attract each other.

39 Loc. cit., p. 344.

REFERENCES 

To begin with source material, there are: Schrödinger’s charming volume “Wave
Mechanics” (Blackie and Son, London, 1928) which is a collection of his epoch-making
papers of 1926 and 1927; Heisenberg’s more popular “The Physical Principles of the
Quantum Theory” (Chicago University Press, Chicago, 1930); De Broglie and Brillouin’s
“Selected Papers on Wave Mechanics” (Blackie and Son, London, 1928); and
Born and Jordan’s “Elementare Quantenmechanik” (J. Springer, Berlin, 1930). The
foundations of the subject, both mathematical and philosophical, are treated most thoroughly
but also most abstractly by Dirac in his “Principles of Quantum Mechanics”
(Clarendon Press, Oxford, Third Edition, 1947) and by J. v. Neumann in “Mathematische
Grundlagen der Quantenmechanik” (J. Springer, Berlin, 1932).

General treatises are:

Condon, E. U., and Morse, P. M., “Quantum Mechanics,” McGraw-Hill Book Co.,
Inc., New York, 1929.

Ruark, A. E., and Urey, H. C., “Atoms, Molecules, and Quanta,” McGraw-Hill Book
Co., Inc., New York, 1930.

De Broglie, L., “Théorie de la Quantification,” Hermann et Cie, Paris, 1932.

Frenkel, J., “Wave Mechanics,” Vols. I and II, Clarendon Press, Oxford, 1932, 1934.

“Handbuch der Physik,” Vol. XXIV, Parts I and II (numerous authors), Julius
Springer, Berlin, 1933.

Pauling, L., and Wilson, E. B., “Introduction to Quantum Mechanics,” McGraw-Hill
Book Co., Inc., New York, 1935.

Jordan, P., “Anschauliche Quantenmechanik,” J. Springer, Berlin, 1936.

Kemble, E. C., “The Fundamental Principles of Quantum Mechanics,” McGraw-Hill
Book Co., Inc., New York, 1937.

Dushman, S., “Elements of Quantum Mechanics,” John Wiley and Sons, Inc., New
York, 1938.

Sommerfeld, A., “Atombau und Spektrallinien,” Vol. II, Vieweg und Sohn, Braunschweig,
1939.

Rojanski, V., “Introductory Quantum Mechanics,” Prentice-Hall, Inc., New York,
1939.

Mott, N. F., and Sneddon, I. N., “Wave Mechanics and its Applications,” Oxford Press,
1948.

Schiff, L. I., “Quantum Mechanics,” McGraw-Hill Book Co., Inc., New York, 1949.

Bohm, D., “Quantum Theory,” Prentice-Hall, Inc., New York, 1951.

Slater, J. C., “Quantum Theory of Matter,” McGraw-Hill Book Co., Inc., New York,
1951.

Houston, W. V., “Principles of Quantum Mechanics,” McGraw-Hill Book Co., Inc.,
New York, 1951.

Landé, A., “Quantum Mechanics,” Pitman Publishing Corp., New York, 1951.

A list of books in which quantum mechanics is applied to special problems follows.

Van Vleck, J. H., “The Theory of Electric and Magnetic Susceptibilities,” Clarendon
Press, Oxford, 1932.

Condon, E. U., and Shortley, G. H., “The Theory of Atomic Spectra,” The Macmillan
Co., New York, 1935.



Seitz, F., “Modern Theory of Solids,” McGraw-Hill Book Co., Inc., New York, 1940.

Pauling, L., “The Nature of the Chemical Bond,” Cornell University Press, Ithaca,
N. Y., 1940.

Eyring, H., Walter, J. and Kimball, G. E., “Quantum Chemistry,” John Wiley and
Sons, Inc., New York, 1944.

Glasstone, S., “Theoretical Chemistry,” D. Van Nostrand Co., Inc., New York, 1944.

Flügge, S., and Marschall, H., “Rechenmethoden der Quantentheorie,” Part 1, Springer,
Berlin, 1947. This book contains many problems on elementary quantum mechanics
and the details of the solution for each.

Mott, N. F., and Massey, H. S. W., “The Theory of Atomic Collisions,” Second Edition,
Clarendon Press, Oxford, 1948.

Corson, E. M., “Perturbation Methods in the Quantum Mechanics of n-Electron Systems,”
Hafner Publishing Co., New York, 1950.

Coulson, C. A., “Valence,” Clarendon Press, Oxford, 1952.

Pitzer, K. S., “Quantum Chemistry,” Prentice-Hall, Inc., New York, 1953.

Corson, E. M., “Introduction to Tensors, Spinors, and Relativistic Wave-Equations,”
Blackie and Son Ltd., London, 1953.

Finkelnburg, W., “Einführung in die Atomphysik,” Third Edition, Springer, Berlin,
1954.




CHAPTER 12 
STATISTICAL MECHANICS 


12.1. Permutations and Combinations. — The purpose of the present 
chapter is not primarily an exposition of the ideas of statistical mechanics, 
which is available in several modern texts, 1 but a brief and summary review 
of the chief analytical techniques used in the treatment of this subject. 
We begin by discussing the principal formulas of the theory of combinations. 

a. The number of possible permutations of n different (distinguishable)
objects is $n!$.

The proof is simple: the first object can be put in n different positions.
When its place is fixed, n − 1 different positions are left open for the second.
Hence these two objects can be arranged in n(n − 1) different ways
without disturbing the relative order of the remaining (n − 2) objects.
But the third can occupy n − 2 different places, and so on. The total
number of possible arrangements is therefore $n(n-1)(n-2)\cdots 2\cdot 1 = n!$.


b. Suppose we wish to arrange the n objects in r piles, the number in
each pile being prescribed. Let the number of objects in the first pile be
$n_1$, that in the second $n_2$, etc., so that $\sum_{i=1}^{r} n_i = n$. It is desired to find the
number, $M$, of possible arrangements of this kind. If $M$ is multiplied by
the number of possible permutations of all objects in the first pile, then by
the number of possible permutations of the objects in the second pile and
so on for all the piles, we must obtain the total number of permutations of
n objects. Thus

$$M\,n_1!\,n_2!\cdots n_r! = n!$$

whence

$$M = \frac{n!}{n_1!\,n_2!\cdots n_r!} \tag{12-1}$$


1 Tolman, R. C., “The Principles of Statistical Mechanics,” Clarendon Press,
Oxford, 1938. Chapman, S. and Cowling, T. G., “The Mathematical Theory of Non-Uniform
Gases,” University Press, Cambridge, 1939. Mayer, J. E. and Mayer, M. G.,
“Statistical Mechanics,” John Wiley and Sons, 1940. Lindsay, R. B., “Physical Statistics,”
John Wiley and Sons, 1941.

There is another combinatorial problem which leads to the same result.
Suppose the n objects fall into r classes, the objects in each class being alike
(indistinguishable). Let the first class contain $n_1$ objects, the second $n_2$,
etc. The number of possible distinguishable arrangements of the n objects
will then be seen to be obtainable by the reasoning employed above.
Hence $M$ represents also the number of arrangements of n things groupable
into r classes, the members of each class being alike.
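
Both readings of eq. (12-1) can be confirmed by brute-force enumeration for small cases. In the sketch below the function names are illustrative only; `pile_assignments` counts the ways of distributing distinguishable objects over piles of prescribed sizes, while `distinct_arrangements` counts the distinguishable orderings of objects that fall into classes of like members.

```python
from itertools import permutations, product
from math import factorial

def multinomial(*counts):
    """n! / (n1! n2! ... nr!) as in eq. (12-1), with n the sum of the counts."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

def pile_assignments(sizes):
    """Count assignments of n distinguishable objects to piles of given sizes."""
    n, r = sum(sizes), len(sizes)
    return sum(
        1
        for labels in product(range(r), repeat=n)
        if all(labels.count(i) == sizes[i] for i in range(r))
    )

def distinct_arrangements(word):
    """Count distinguishable orderings of the letters of `word`."""
    return len(set(permutations(word)))

print(multinomial(2, 1, 1), pile_assignments((2, 1, 1)), distinct_arrangements("aabc"))
print(multinomial(3, 2), pile_assignments((3, 2)), distinct_arrangements("aaabb"))
```

The enumeration grows very quickly with n and is meant only as a check on the formula for small numbers.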

c. The number of ways in which m objects can be selected from a set of
n objects is $n!/[m!(n-m)!]$. This follows at once from (1), for a withdrawal
of m individuals is equivalent to an arrangement of the n objects
into two piles, one containing m, the other (n − m) objects. We note that
this number is the binomial coefficient

$$\binom{n}{m} = \frac{n!}{m!(n-m)!} \tag{12-2}$$

It is often referred to as the number of combinations of n things taken m at a
time. We observe that, since

$$\binom{n}{m} = \binom{n}{n-m}$$

it is equal to the number of combinations of n things taken n − m at a time.

Eq. (2) also provides the answer to another, apparently different question.
Assume that we have n boxes, and a smaller number, m, of indistinguishable
objects to be placed in them in such a way that no box contains
more than one object. The number of ways in which this can be
done is given by (2), for the assignment of m objects to n boxes is entirely
equivalent to the selection of m objects from a set of n objects.


d. When, in accordance with theorem (c), a certain selection of m
objects has been made, a permutation among these m objects does not
produce a new combination. It does, however, produce a new arrangement.
Thus, to every combination given by eq. (2), there correspond m!
arrangements of the m objects. The total number of arrangements of n
things taken m at a time is therefore

$$m!\binom{n}{m} = \frac{n!}{(n-m)!} \tag{12-3}$$

If, in the problem of placing m objects into n boxes ($n \ge m$) discussed
in (c), the objects are assumed to be distinguishable, so that our interest is
no longer merely in the individual boxes each of which contains an object,
but also in the arrangement of the individual objects placed in them,
eq. (3) is applicable. It expresses the number of ways in which m distinguishable
objects can be placed in n boxes, zero or one object per box.
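
Theorems (c) and (d) lend themselves to a direct check by enumeration. The following sketch, with n = 5 boxes and m = 3 objects chosen arbitrarily, lists the selections counted by eq. (12-2) and the placements counted by eq. (12-3); the variable names are illustrative.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, m = 5, 3
boxes = range(n)

# (c): which m boxes are occupied, the objects being indistinguishable.
selections = list(combinations(boxes, m))

# (d): which box each of the m distinguishable objects occupies,
# at most one object per box.
placements = list(permutations(boxes, m))

print(len(selections), comb(n, m))    # both counts follow eq. (12-2)
print(len(placements), perm(n, m))    # both counts follow eq. (12-3)
```

Each selection corresponds to m! placements, which is the content of the step from (2) to (3).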

e. Let us now determine the number of ways in which m indistinguishable
particles can be put into n boxes. Suppose that the m particles were
placed along a line in any manner whatever and that (n − 1) partitions
were used to separate the particles. If one more partition were then placed
at the end of the line, the particles could be regarded as having been placed
into n boxes. If, therefore, we consider the m particles and the (n − 1)
partitions, visualized as walls, as a set of (m + n − 1) objects, our problem
becomes tantamount to finding the number of ways in which (n − 1) walls
can be arranged among the totality of (m + n − 1) objects. This number,
from sec. c and eq. (2), is

$$\binom{n+m-1}{n-1} = \binom{n+m-1}{m} \tag{12-4}$$

The preceding result is obtainable in several other ways, among which
the following is sometimes given. Suppose that there are n boxes and m
objects, as before. The first box can be selected in n ways, leaving
(n + m − 1) boxes and objects which can be arranged in (n + m − 1)!
ways, or a total number of n(n + m − 1)! arrangements. However, permutations
of boxes or particles among themselves do not correspond to
recognizably different arrangements. Since the number of these permutations
is n! m!, the desired number is again given by eq. (4).

In the mathematical literature, the result of eq. (4) is sometimes known
as the number of “combinations with repetitions.” We note that it equals
the number of combinations of (n + m − 1) things taken m at a time,
where repetitions are not allowed.

A recursion formula for the case with repetition is sometimes useful.
If there are three objects, taken two at a time, it is found that there are
six possibilities: (aa, ab, ac, bb, bc, cc); if taken three at a time, there are
ten cases: (aaa, aab, aac, abb, abc, acc, bbb, bbc, bcc, ccc). By mathematical
induction, it is easy to show that for n objects taken m at a time

$$C_m(n) = \frac{n+m-1}{m}\,C_{m-1}(n) \tag{12-5}$$

If m is given the successive values 1, 2, 3, ···, k and the equations multiplied
together, the result is the now familiar one of eq. (4).
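
The count of combinations with repetitions and the recursion (12-5) can both be verified for the three-object example. In the sketch below the function names are illustrative, and the recursion is started from $C_0(n) = 1$, an assumption that simply expresses the single way of taking no object at all.

```python
from itertools import combinations_with_replacement
from math import comb

def with_repetition(n, m):
    """Closed form of eq. (12-4)."""
    return comb(n + m - 1, m)

def by_recursion(n, m):
    """Build C_m(n) from C_0(n) = 1 with eq. (12-5)."""
    c = 1
    for k in range(1, m + 1):
        c = c * (n + k - 1) // k   # exact: every intermediate C_k(n) is an integer
    return c

pairs = ["".join(p) for p in combinations_with_replacement("abc", 2)]
triples = ["".join(t) for t in combinations_with_replacement("abc", 3)]
print(pairs)
print(triples)
print(with_repetition(3, 2), by_recursion(3, 2), with_repetition(3, 3), by_recursion(3, 3))
```

The enumeration reproduces the six pairs and ten triples listed in the text.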

f. The number of ways in which m distinguishable objects may be placed
in n boxes is clearly $n^m$, for the first object can be put into n places; with
each of these dispositions of the first object can be combined n dispositions
of the second object, and so on.


12.2. Binomial Coefficients. — The coefficients $\binom{n}{m}$ appear in Newton’s
famous binomial expansion

$$(a + b)^n = \sum_{t=0}^{n}\binom{n}{t}a^t b^{n-t} \tag{12-6}$$





where t is an integer. Its proof is fairly obvious, since the number of ways
in which t factors a and (n − t) factors b can be selected from n factors
(a + b) is $\binom{n}{t}$, by virtue of (2). We note that

$$\binom{n}{0} = 1, \qquad \binom{n}{n} = 1$$

also

$$\binom{n}{t} = 0 \quad \text{if } t > n,$$

this because $(n - t)! = \infty$.

Most of the relations to be studied here are valid for non-integral values of
n provided we define

$$\binom{x}{t} = \frac{x(x-1)\cdots(x-t+1)}{t!}$$

An important series in binomial coefficients may be obtained as follows.
$\binom{n+k}{r}$ is the coefficient of $a^r b^{n+k-r}$ in the expansion of
$(a + b)^{n+k}$. But

$$(a+b)^n(a+b)^k = \left[\sum_t\binom{n}{t}a^t b^{n-t}\right]\left[\sum_s\binom{k}{s}a^s b^{k-s}\right] = \sum_t\sum_s\binom{n}{t}\binom{k}{s}a^{t+s}b^{n+k-t-s}$$

The coefficient of $a^r b^{n+k-r}$ in this double sum is obtained by putting
t + s = r and summing over t. Hence

$$\binom{n+k}{r} = \sum_t\binom{n}{t}\binom{k}{r-t} \tag{12-7}$$

This is known as the addition theorem of the binomial coefficients.
From it, numerous other relations can be derived.

On putting k = 1, we have

$$\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$$

If k = r = n,

$$\binom{2n}{n} = \sum_t\binom{n}{t}\binom{n}{n-t} = \sum_t\binom{n}{t}^2$$





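If we observe that

$$\binom{-1}{t} = (-1)^t$$

we may also put k = −1 in (7), obtaining

$$\binom{n-1}{r} = \sum_t(-1)^{r-t}\binom{n}{t}$$

If, in Newton's formula, we let a = b = 1, we find

$$\sum_{t=0}^{n}\binom{n}{t} = 2^n$$

but if a = −b,

$$\sum_{t=0}^{n}(-1)^t\binom{n}{t} = 0$$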
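
The relations of this section are easily tested for small integers. The sketch below is only a numerical check under stated assumptions: `gen_binom` implements the generalized coefficient $x(x-1)\cdots(x-t+1)/t!$ with exact rational arithmetic, sums over t are taken from 0 to r (or 0 to n), and all function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def gen_binom(x, t):
    """Generalized binomial coefficient x(x-1)...(x-t+1)/t! for integer t >= 0."""
    x = Fraction(x)
    num = Fraction(1)
    for j in range(t):
        num *= x - j
    return num / factorial(t)

def addition_theorem(n, k, r):
    """Right-hand side of eq. (12-7), summed over t = 0, ..., r."""
    return sum(comb(n, t) * comb(k, r - t) for t in range(r + 1))

def identities_hold(n, k, r):
    """Check the relations of sec. 12.2 for integers n >= 1, k >= 0, r >= 1."""
    checks = [
        comb(n + k, r) == addition_theorem(n, k, r),
        comb(n + 1, r) == comb(n, r) + comb(n, r - 1),
        comb(2 * n, n) == sum(comb(n, t) ** 2 for t in range(n + 1)),
        all(gen_binom(-1, t) == (-1) ** t for t in range(8)),
        comb(n - 1, r) == sum((-1) ** (r - t) * comb(n, t) for t in range(r + 1)),
        sum(comb(n, t) for t in range(n + 1)) == 2 ** n,
        sum((-1) ** t * comb(n, t) for t in range(n + 1)) == 0,
    ]
    return all(checks)

print(all(identities_hold(n, k, r)
          for n in range(1, 7) for k in range(0, 6) for r in range(1, 8)))
print(gen_binom(Fraction(1, 2), 2))
```

The last line illustrates a non-integral upper index, for which the definition by factorials is not available.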

12.3. Elements of Probability Theory. — An aggregate of elements, such
as a set of observations or a sequence of results of some operation (e.g.,
throwing a die), is called a probability aggregate if it is permissible to apply
the rules of the probability calculus to the aggregate. Whether or not this
application is proper is usually decided on the basis of intuition: it seems
clear that the decimal expansion of the fraction 1/7 does not form an aggregate
of digits to which probability considerations may validly be applied;
on the other hand, no hesitation is felt in subjecting the outcome of a series
of throws of a die to probability reasoning. In the former sequence
(.142857142857, etc.) the digits occur with too much regularity to be
regarded as “distributed at random.” The criteria for randomness,
which decide whether an aggregate is a probability aggregate, may be
stated with considerable precision$^{2}$ but will be omitted here.

Every element is regarded as having one of a number, s, of distinguishable
properties. (Each throw of a die is an element, the number appearing
uppermost is a property; s = 6. In measuring a physical quantity, each
measurement is an element, each measured value a property; s may be
infinite in this example.) If $n_i$ is the number of times the i-th property
occurs and n the total number of elements,

$$\frac{n_i}{n}$$

is defined as the relative frequency of the i-th property. By the probability
of the i-th property is meant the limit

$$\lim_{n\to\infty}\frac{n_i}{n} = w_i \tag{12-8}$$

2 See Lindsay, R. B., and Margenau, H., “Foundations of Physics,” John Wiley
and Sons, 1936.





The existence of this limit is a matter which has given rise to considerable
discussion; it will here be assumed.$^{3}$ The totality of the $w_i$ is called the
distribution of the probability aggregate. Obviously, $\sum_i w_i = 1$.
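
The definition (12-8) may be illustrated, though of course not proved, by a simulated series of throws of a die. The sketch below uses Python's pseudo-random generator as a stand-in for a probability aggregate; the seed, the number of throws and the function name are arbitrary choices of the example.

```python
import random

def relative_frequencies(n_throws, sides=6, seed=0):
    """Relative frequency n_i / n of each face in a simulated series of throws."""
    rng = random.Random(seed)
    counts = [0] * sides
    for _ in range(n_throws):
        counts[rng.randrange(sides)] += 1
    return [c / n_throws for c in counts]

for n in (60, 6000, 600000):
    freqs = relative_frequencies(n)
    print(n, [round(f, 4) for f in freqs], "sum =", round(sum(freqs), 12))
```

As n grows the relative frequencies settle near 1/6, and at every n they add up to unity, in keeping with the remark that the $w_i$ sum to 1.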

The properties may be discrete (throwing dice) or continuous (value of a
physical quantity, such as position of a particle). In the former case the
distribution is sometimes said to be arithmetical, in the latter case, geometrical.
In the continuous case a different formulation of probability is more
convenient. Let x denote the continuous property. The probability
$w_x$, defined by (8), is clearly zero, but the probability that x shall lie between
x and $x + \Delta x$ is finite and is, moreover, usually proportional to the range
$\Delta x$ provided this range is sufficiently small. Hence we may write for this
probability

$$w(x)\,\Delta x$$

and the function w(x), which does not have the physical dimension of a
probability (a pure number), is called the probability density. Clearly,

$$\int w(x)\,dx = 1$$

if the integral is taken over the entire range of properties.

When a distribution $w_i$ or $w(x)$ is given, certain expressions frequently
occurring in statistical theories can be calculated. We present the most
important of these, using parallel formulations for the arithmetical and
geometrical cases. To make this possible, we write $w(x_i)$ for the former
$w_i$, thus letting $x_i$ represent the i-th property.

If $f(x)$ is a function defined for every $x_i$ (or x) which has a non-vanishing
probability, the mean of $f(x)$ with respect to the distribution $w(x)$ is
given by

$$\bar f = \begin{cases}\sum_i f(x_i)\,w(x_i) \\[4pt] \int f(x)\,w(x)\,dx\end{cases} \tag{12-9}$$

The dispersion of $f(x)$ with respect to $w(x)$ is defined by

$$D(f) = \begin{cases}\sum_i\,[f(x_i) - \bar f\,]^2\,w(x_i) \\[4pt] \int [f(x) - \bar f\,]^2\,w(x)\,dx\end{cases} \tag{12-10}$$

On taking for the function $f(x)$ the variable x itself there results

$$\bar x = \begin{cases}\sum_i x_i\,w(x_i) \\[4pt] \int x\,w(x)\,dx\end{cases} \tag{12-11}$$

3 For further remarks see Lindsay and Margenau, loc. cit., Chapter 4.



and

$$D(x) \equiv \sigma^2 = \begin{cases}\sum_i (x_i - \bar x)^2\,w(x_i) \\[4pt] \int (x - \bar x)^2\,w(x)\,dx\end{cases} \tag{12-12}$$

The quantity $\sigma^2$ is called the dispersion of the distribution w(x); $\sigma$ is
known as the standard deviation. As is clear from its definition, $\sigma$ is a
measure of the spread of w(x) about its mean. If w(x) were regarded as a
distribution of mass, $\sigma$ would represent its radius of gyration. By the
r-th moment of the distribution is meant the quantity

$$\overline{x^r} = \begin{cases}\sum_i x_i^r\,w(x_i) \\[4pt] \int x^r\,w(x)\,dx\end{cases}$$

For distributions with an infinite range of properties, higher moments do
not always exist. The dispersion of w(x) may be expressed in terms of its
first and second moments. In view of (12),

$$\sigma^2 = \overline{x^2} - 2\bar x^2 + \bar x^2 = \overline{x^2} - \bar x^2$$
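
For an arithmetical distribution the definitions (9) to (12) translate directly into sums. The sketch below applies them to a fair die, $w(x_i) = 1/6$ for $x_i = 1, \ldots, 6$, using exact fractions; the dictionary representation of the distribution and the function names are choices made for the illustration.

```python
from fractions import Fraction

def mean(f, dist):
    """Mean of f with respect to a discrete distribution {x_i: w(x_i)}, eq. (12-9)."""
    return sum(f(x) * w for x, w in dist.items())

def dispersion(f, dist):
    """Dispersion of f with respect to the distribution, eq. (12-10)."""
    fbar = mean(f, dist)
    return mean(lambda x: (f(x) - fbar) ** 2, dist)

def moment(r, dist):
    """r-th moment of the distribution."""
    return mean(lambda x: x ** r, dist)

die = {x: Fraction(1, 6) for x in range(1, 7)}

x_bar = moment(1, die)
sigma_sq = dispersion(lambda x: x, die)
print(x_bar, sigma_sq, moment(2, die) - x_bar ** 2)
```

The last two printed numbers coincide, which is the relation between the dispersion and the first two moments just derived.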

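Under certain conditions it is possible to expand a geometrical distribution
in terms of its moments, provided these exist. For simplicity we shall
take these moments about $\bar x$ as origin, so that $\bar x = 0$, $\overline{x^2} = \sigma^2$, etc. One
can then prove$^{4}$ that

$$w(x) = \frac{e^{-x^2/2\sigma^2}}{\sigma\sqrt{2\pi}}\left[1 + \sum_{i=3}^{\infty}\frac{c_i}{i!}\,H_i\!\left(\frac{x}{\sigma}\right)\right]$$

where $H_i$ is the i-th Hermite polynomial (taken here in the form orthogonal
with respect to the weight $e^{-z^2/2}$, with unit leading coefficient), and

$$c_3 = \frac{\overline{x^3}}{\sigma^3}, \qquad c_4 = \frac{\overline{x^4}}{\sigma^4} - 3, \qquad c_5 = \frac{\overline{x^5}}{\sigma^5} - 10\,\frac{\overline{x^3}}{\sigma^3}, \qquad c_6 = \frac{\overline{x^6}}{\sigma^6} - 15\,\frac{\overline{x^4}}{\sigma^4} + 30$$

This expansion is particularly useful when w(x) does not depart too
greatly from a normal “Gauss” distribution: $w(x) = e^{-x^2/2\sigma^2}/\sigma\sqrt{2\pi}$.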


Problems. Two geometrical distributions of considerable interest in physics and
chemistry are

$$w_1(x) = \frac{h}{\sqrt{\pi}}\,e^{-h^2(x-a)^2}, \qquad w_2(x) = \frac{a}{\pi}\,\frac{1}{a^2 + x^2}$$

4 See Zernike, F., “Handbuch der Physik,” Vol. III, J. Springer, 1928, p. 448.

a. Show that, for $w_1$, $\bar x = a$ and $\sigma^2 = 1/2h^2$.

b. Show that the r-th moment of $w_1$ is $1\cdot 3\cdot 5\cdots(r-1)/2^{r/2}h^r$ if r is even; for
odd r it is zero. (Take a = 0.) All moments of $w_1$ are finite.

c. Show that, for $w_2$, all odd moments are zero and no even moment (except the
zero-th) exists.


12.4. Special Distributions. — A problem which is basic in statistical 
mechanics and in the theory of errors will here be discussed in some detail. 
It is of considerable historical interest, its solution being connected with the 
names of Newton, Bernoulli, Laplace, Poisson and Gauss. Consider n 
boxes, each containing P black balls and Q white balls. We wish to find
the probability $w_n(m)$ that, in drawing one ball from each of the n boxes,
m of them will be white.

The probability of drawing a black ball from a given box is clearly
$P/(P + Q) = p$, that of drawing a white ball is $Q/(P + Q) = q$. Thus
$w_1(0) = p$, $w_1(1) = q$. If n = 2, the probability aggregate has the
following properties: bb, bw, wb, ww (b = black, w = white), and these
occur with the probabilities $p^2$, $pq$, $pq$, $q^2$; hence $w_2(0) = p^2$, $w_2(1) = 2pq$,
$w_2(2) = q^2$. In general, the probability that m white balls will be
drawn from m specified boxes and n − m black ones from the remaining
boxes will be

$$p^{n-m}q^m$$

But in view of eq. (2) there are $\binom{n}{m}$ ways of selecting m boxes from a
total number of n boxes. Hence the answer to the problem, first found by
Newton, is

$$w_n(m) = \binom{n}{m}p^{n-m}q^m \tag{12-13}$$

It is clear from (6) that

$$\sum_{m=0}^{n}w_n(m) = 1$$

since q + p = 1. Eq. (13) has of course a more general significance than
the one here particularized: it represents the probability of m successes in
n independent trials if the probability of success in a single trial is q.

To calculate the mean of m and the dispersion of the arithmetical distribution
$w_n(m)$ we consider the identity

$$(p + qy)^n = \sum_{m=0}^{n}w_n(m)\,y^m$$

where y is a variable. On differentiation with respect to y this reads

$$n(p + qy)^{n-1}q = \sum_{m=0}^{n}m\,w_n(m)\,y^{m-1} \tag{12-14}$$

When we let y = 1 in this equation, the right hand side becomes $\bar m$, so that

$$\bar m = nq \tag{12-15}$$

The mean number of successes is equal to the probability of success in a
single trial, multiplied by the number of trials.

To find the dispersion, we differentiate (14) once more, and then set
y = 1. The result is:

$$n(n-1)q^2 = \sum_{m=0}^{n}m(m-1)\,w_n(m) = \overline{m^2} - \bar m$$

To obtain the dispersion we must add to both sides the quantity
$\bar m - \bar m^2$ which, according to (15), equals $nq - n^2q^2$. Hence

$$\sigma^2 = \overline{m^2} - \bar m^2 = nq(1 - q) = nqp \tag{12-16}$$


Especially interesting is the case where $q \ll p$, so that $p \approx 1$. For then
the dispersion is numerically equal to the mean number of successes, a criterion
which can sometimes be used to determine whether the successes are
due entirely to chance. For applications of the formulas here developed,
particularly to the case of radioactive emission, the reader is referred to
Lindsay’s Physical Statistics. (See also the problem of the random walk
at the end of this section.)
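
Eqs. (13), (15) and (16) can be confirmed exactly for any particular n and q by summing over m. The sketch below uses rational arithmetic so that the comparison involves no rounding; the values n = 12, q = 1/3 and the function names are arbitrary choices of the example.

```python
from fractions import Fraction
from math import comb

def w(n, m, q):
    """Eq. (12-13): probability of m successes in n independent trials."""
    return comb(n, m) * (1 - q) ** (n - m) * q ** m

def summary(n, q):
    """Return (sum of w_n(m), mean of m, dispersion of m)."""
    probs = [w(n, m, q) for m in range(n + 1)]
    total = sum(probs)
    mean = sum(m * pm for m, pm in enumerate(probs))
    second = sum(m * m * pm for m, pm in enumerate(probs))
    return total, mean, second - mean ** 2

n, q = 12, Fraction(1, 3)
total, mean, dispersion = summary(n, q)
print(total, mean, dispersion)
print(mean == n * q, dispersion == n * q * (1 - q))
```

For small q the factor p = 1 − q is close to unity and the dispersion approaches the mean, which is the criterion mentioned in the preceding paragraph.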

For large values of n and m expression (13) is difficult to use because of
the inconvenience in dealing with factorials of large numbers. We shall
now prove that in this case $w_n(m)$ can be approximated by the Gauss error
law. Let us first see what happens to $w_n(m)$ as $n \to \infty$. It is clear from
(15) and (16) that both $\bar m$ and $\sigma^2$ tend to infinity, that is to say, if we were
to plot $w_n(m)$ against m, the mean (which for sufficiently large m is also
the maximum of $w_n(m)$) would move outward from the origin and the
distribution would broaden out indefinitely. However, the quantity x,
defined as the deviation from the mean and measured on a proper scale
which contracts as n increases, namely

$$x = \frac{m - \bar m}{\sqrt{n}} \tag{12-17}$$

will remain finite. We shall try to convert $w_n(m)$ into $w(x)$, assuming that
$n \to \infty$.

First compute

$$\ln w_n(m) = \ln n! - \ln m! - \ln (n - m)! + (n - m)\ln p + m\ln q$$

Now by Stirling’s formula, which is valid for large numbers,

$$\ln n! = (n + \tfrac{1}{2})\ln n - n + \tfrac{1}{2}\ln 2\pi + \frac{1}{12n} + (\text{terms of order } n^{-3})$$





Hence 5 

, , , i , 2irm(n - m) , f m , , _ n — rn 

— urn In u; n (m) = o In (-mm (- (n — m) In 

*-+ «. n ' np 

( 12 - 18 ) 

In view of (17) and (15) 

When these expressions are introduced in (18) there results 

1 x* 1 

-In w(x) = | In 2impq +- f- - — 

Z q Z p 

provided we use the expansion of the logarithm 

x ^ 

In (1 + x) = x - — 4 

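and retain no terms in negative powers of n. Thus we have, since
$p + q = 1$,

$$ -\ln w(x) = \tfrac{1}{2} \ln 2\pi npq + \frac{x^2}{2pq} $$

whence

$$ w(x) = \frac{1}{\sqrt{2\pi npq}} \, e^{-x^2/2pq} \tag{12-19} $$

When written again in terms of m it is

$$ \lim_{n \to \infty} w_n(m) = \frac{1}{\sqrt{2\pi npq}} \, e^{-(m - \bar{m})^2/2npq} = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-(m - \bar{m})^2/2\sigma^2} \tag{12-20} $$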
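The quality of the approximation (20) is easy to examine numerically. The following is a minimal sketch in Python, assuming the convention of eq. (13) in which q is the probability of a single success, so that the mean is nq and the dispersion is npq. The function names `binomial_w`, `gauss_w` and `max_abs_error` are illustrative choices and do not come from the text.

```python
import math


def binomial_w(n, m, q):
    """Exact w_n(m): probability of m successes in n trials (success probability q)."""
    p = 1.0 - q
    return math.comb(n, m) * p ** (n - m) * q ** m


def gauss_w(n, m, q):
    """Gaussian form (12-20) with mean nq and dispersion sigma^2 = npq."""
    p = 1.0 - q
    var = n * p * q
    return math.exp(-(m - n * q) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def max_abs_error(n, q):
    """Largest absolute difference between the two expressions over all m."""
    return max(abs(binomial_w(n, m, q) - gauss_w(n, m, q)) for m in range(n + 1))


# The discrepancy should shrink as n grows (q = 0.3 is an arbitrary choice).
for n in (10, 100, 1000):
    print(n, max_abs_error(n, 0.3))
```

The value q = 0.3 is deliberately asymmetric; with p = q = 1/2 the agreement is better still, since the distribution is then symmetric for every n.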

These results have a special significance with respect to errors of
measurement, as can be seen from the following (oversimplified) argument.
Suppose that the true value of a measured quantity is A, but that there
are n causes of error, each of which will add to A the amount $\Delta A$ or $-\Delta A$
with equal probability. If m of these n causes contribute $\Delta A$ then the
resulting error is $r \Delta A = [m - (n - m)] \Delta A$, and therefore the probability
of this error is $w_n(m)$ with $m = (n + r)/2$. For large n the distribution
of errors is then given by (20): 6

$$ w(r) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-(r - \bar{r})^2/8\sigma^2} $$

5 In arriving at this result, it is convenient to add to the literal expansion of the
logarithm by Stirling’s formula the quantity $(m \ln n - m \ln n)$.

6 Note however, that $\sigma^2$ is no longer the dispersion with respect to the r-distribution.

Furthermore, f w(r)dr 





If $p = q = \tfrac{1}{2}$, $\bar{m} = n/2$ and $\bar{r} = 0$. If we denote $r \Delta A$ by $\epsilon_r$, Gauss’ error
law

$$ w(\epsilon_r) = \text{const} \cdot e^{-h^2 \epsilon_r^2} $$

immediately results; the constant h, which depends on $\Delta A$ and is always
determined empirically, is often called the “measure or index of precision.”

In the analysis leading to (19) quantities of the order $1/nq$ and $1/npq$
were neglected, the assumption being that p and q are numbers not greatly
different from unity. Under these conditions the mean of m, $nq$, is a large
number. This, then, is a criterion for the applicability of eq. (19).

It may happen, however, that q is small, so small indeed that $nq$ is of
order unity in a given application. In this case the distribution (13) has,
to be sure, spread out indefinitely ($n \to \infty$) but the mean has remained
small; the resulting distribution is quite asymmetrical. To deal with this
situation we put


a a 




Problem a. Plot w n (m) for q = -g-; n = 5, 10, 50. Observe the change from an 
asymmetrical to a symmetrical distribution. Compare Poisson’s formula with the 
plot for n = 5. 
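For Problem a it may help to tabulate the exact distribution next to Poisson's formula. The sketch below assumes the standard form of that formula, $a^m e^{-a}/m!$ with $a = nq$. The value of q is not legible in this copy of the problem, so the choice $q = a/n$ with $a = 1$ is made purely for illustration, and the helper names `binomial_w`, `poisson_w` and `max_gap` are hypothetical.

```python
import math


def binomial_w(n, m, q):
    """Exact w_n(m) for success probability q."""
    return math.comb(n, m) * (1.0 - q) ** (n - m) * q ** m


def poisson_w(a, m):
    """Poisson's formula with mean a (assumed standard form, a = nq)."""
    return a ** m * math.exp(-a) / math.factorial(m)


def max_gap(n, a, m_max=20):
    """Largest difference between w_n(m) with q = a/n and the Poisson value."""
    q = a / n
    top = min(n, m_max)
    return max(abs(binomial_w(n, m, q) - poisson_w(a, m)) for m in range(top + 1))


# Side-by-side table for n = 5, the case singled out in the problem.
for m in range(6):
    print(m, round(binomial_w(5, m, 0.2), 4), round(poisson_w(1.0, m), 4))
```

Holding the mean a fixed while n grows makes the two columns merge, which is the limit in which q is small and nq stays of order unity.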





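Problem b. “Random walk.” A person, making steps of length l, is just as likely to
step forwards as backwards ($p = q = \tfrac{1}{2}$). Prove that, after taking n steps, he will
have gone forward a distance $rl$ with a probability

$$ w_n(r) = \frac{n!}{\left(\frac{n + r}{2}\right)! \left(\frac{n - r}{2}\right)!} \left(\tfrac{1}{2}\right)^n $$

Show also that $\bar{r} = 0$, $\overline{r^2} = n$.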
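The two moments asked for in Problem b can be checked by direct summation before attempting the proof. This is a small sketch, assuming the probability of a net displacement r is $w_n(m)$ with $m = (n + r)/2$ and $p = q = \tfrac{1}{2}$, as in the error argument of the preceding pages. The names `walk_probability` and `moments` are illustrative.

```python
import math


def walk_probability(n, r):
    """Probability of a net forward displacement r (in units of the step length)."""
    if abs(r) > n or (n + r) % 2:
        return 0.0  # r must have the same parity as n and cannot exceed n
    return math.comb(n, (n + r) // 2) / 2 ** n


def moments(n):
    """Return the mean of r and the mean of r^2 after n steps."""
    rs = range(-n, n + 1)
    mean = sum(r * walk_probability(n, r) for r in rs)
    mean_sq = sum(r * r * walk_probability(n, r) for r in rs)
    return mean, mean_sq


print(moments(20))
```

A numerical check of this kind does not replace the proof, but it guards against slips in the parity bookkeeping between m and r.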

12.5. Gibbsian Ensembles. — It is the main purpose of statistical
mechanics to provide a formalism by means of which the facts of thermo- 
dynamics (cf. Chapter 1) can be deduced. This may be done in several 
different ways, that is, on the basis of several distinct sets of fundamental 
axioms. Two of these stand out for their success and clarity. One, the 
system of Gibbs, is particularly suited to a development of the classical 
laws of thermodynamics, i.e., those relations whose understanding is 
possible without the use of quantum mechanics. Gibbs’ statistical mechanics
will be summarized in this section and the next. The remainder of the
present chapter will be devoted to the method of Darwin and Fowler with 
the aid of which the subject of quantum statistics is most satisfactorily 
discussed. 

The central concept of Gibbs’ theory 7 is the ensemble, the meaning of
which will now be discussed. Statistical mechanics deals with certain
properties of physical objects, as for instance a given body of gas, or liquid,
or a solid. Such an object will be called a system, or, more specifically, a
thermodynamic system. If it has n degrees of freedom, then its complete
mechanical state can be specified in terms of n generalized coordinates and
n generalized momenta, a total of 2n numbers. Mathematically, these 2n
numbers may be said to define a point in a space of 2n dimensions, and this
space is called the phase space of the system. At any instant of time, the 
system is represented by one point in its phase space, and in the course of 
time, this point will move, describing a certain trajectory in phase space. 
When the position of the representative point at any instant is known, it is 
theoretically possible by the laws of mechanics to calculate its position at 
any other time, but such a prediction is practically not feasible. Other, 
less detailed methods of description must be chosen. 

In the simplest instance of a thermodynamic system, the ideal gas consisting
of $\nu$ molecules, $n = 3\nu$, and the phase space has $6\nu$ dimensions. A
representative point would correspond to an exact assignment of 3 components
of momentum and position to each of the $\nu$ molecules, and the
path of the point would portray the changes which the values of all these
quantities undergo in time. In this case, another picture is often useful.
One may regard the phase space of the system as being composed of $\nu$
subspaces, one for each molecule. Such a subspace is called a $\mu$-space
(“molecule space”) in order to distinguish it from the entire phase space
which is often designated as $\gamma$-space (“gas space”). In the case of molecules
regarded as mass points, $\mu$-space has 6 dimensions, although in general
for molecules having internal degrees of freedom the number of dimensions
is greater. Use of the $\mu$-space is often very convenient, but it loses its
significance except as an approximate description when strong interaction
exists between the molecules.

7 There is no more lucid and careful exposition of J. W. Gibbs’ ideas than his own,
“Elementary Principles of Statistical Mechanics”; C. Scribner’s Sons, 1902;
Collected Works, vol. II, Longmans, Green & Co., 1928, New York. See also
“Commentary on the Scientific Writings of J. Willard Gibbs,” Yale University Press,
1936.

We shall denote the n generalized coordinates of our system by
$q_1 q_2 \cdots q_n$, the generalized momenta by $p_1 \cdots p_n$. Out of these we
construct an element $d\phi$ of phase space in which the p’s and q’s are taken as
Cartesian coordinates:

$$ d\phi = dq_1 \, dq_2 \cdots dq_n \, dp_1 \, dp_2 \cdots dp_n $$

It is possible to show 8 that any point transformation

$$ q_i' = q_i'(q_1 \cdots q_n) $$

$$ p_i' = p_i'(q_1 \cdots q_n; p_1 \cdots p_n) $$

leaves $d\phi$ invariant; thus

$$ d\phi = dq_1' \cdots dq_n' \, dp_1' \cdots dp_n' $$

Since the system is assumed to obey mechanical laws, Hamilton’s canonical
equations must be valid (cf. 9-13):

$$ \dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad \dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad i = 1, 2, \cdots n \tag{12-22} $$

From these equations it follows at once that through every point in phase 
space there passes but one trajectory; for when every p and q is given, 
equations (22) determine uniquely the rate of change of every coordinate in 
phase space. Hence the representative point can never cross its previous 
path. Whether the motion of the point will ultimately carry it through all 
regions of phase space has not been proved completely; such behavior is 
tentatively asserted by the so-called ergodic hypothesis 9 which, however, is 
not needed in Gibbs’ formulation of statistical theory. 

It would seem that the values of thermodynamic quantities such as 
temperature, pressure, etc., could be regarded as time averages over the 

8 See Gibbs, loc. cit. or Lindsay, loc. cit. 

9 See Tolman, loc. cit. 





motion of the representative point of the system in its own phase space. 
The development of this conjecture, however, is fraught with rather for- 
midable difficulties and is not usually attempted. Instead, Gibbs intro- 
duces what he calls an ensemble of systems, by which is meant a very 
large set of imagined replicas of the one real system under consideration. 
These systems are not in identical states, but the state of each is repre- 
sented by a phase point in its own phase space. Since all imaginary 
systems of the ensemble are similar as to number of molecular constituents 
and Hamiltonian function, all points can be plotted in the same phase space, 
in which they will be distributed with a certain density, D. 

This density will in general be different in different parts of phase space,
and it will change in time. Hence

$$ D = D(p_1 \cdots p_n; q_1 \cdots q_n; t) $$

Nothing has as yet been said about the initial distribution of points in
phase space which, in view of the meaning of the ensemble, is quite arbitrary.
Whatever the functional form of D, we must require

$$ \int D \, d\phi = N \quad \text{for every } t $$

if N is the (very large) number of systems in the ensemble. It is convenient
also to introduce a “probability of phase”

$$ P = \frac{D}{N} $$

such that

$$ \int P \, d\phi = 1 \tag{12-23} $$


12.6. Ensembles and Thermodynamics. — By virtue of Liouville’s 
theorem, proved in almost all books on statistical mechanics (also known 
as the principle of conservation of density-in-phase, a name due to Gibbs), 
the representative points move in phase space as though they constituted 
an incompressible fluid of varying density. A group of points filling a 
certain region of phase space at a time $t_0$ can neither contract nor expand
during its motion; it will continue to occupy the same volume but with
altered shape. Mathematically these statements are expressed as follows:

$$ \frac{\partial D}{\partial t} + \sum_{i=1}^{n} \frac{\partial D}{\partial p_i} \dot{p}_i + \sum_{i=1}^{n} \frac{\partial D}{\partial q_i} \dot{q}_i = 0 \tag{12-24} $$


Thus it is seen that phase space possesses no intrinsic property of accumu- 
lating phase points in some regions or not admitting them to others; 
Liouville’s theorem shows phase space to be indifferent to the motion of the 





points. This fact suggests the following fundamental postulate, by means 
of which contact is established between an ensemble and thermodynamic 
experience: 

The probability that, at any instant t, a given real system be found in
the state characterized by $q_1 \cdots q_n$, $p_1 \cdots p_n$ is the same as the probability
$P(p_1 \cdots p_n; q_1 \cdots q_n; t)$ that a system selected at random from the corresponding
ensemble shall have the phase $q_1 \cdots q_n$, $p_1 \cdots p_n$ at the instant t.
The probability that the values of the p’s and q’s shall lie within a small
extension of phase $\Delta\phi$ is proportional to $\Delta\phi$. We are thus attributing equal
intrinsic probabilities to equal volumes of phase space, a procedure suggested,
though not made necessary, by Liouville’s principle.

In accordance with this postulate we may calculate mean values of
dynamical quantities of the real system by computing mean values over
the individuals composing the ensemble. If R is such a quantity, expressible
as a function of momenta and coordinates, then

$$ \bar{R}(t) = \int R(p_1 \cdots p_n; q_1 \cdots q_n) \, P(p_1 \cdots p_n; q_1 \cdots q_n; t) \, d\phi \tag{12-25} $$

And by $\bar{R}(t)$ is meant in general the expected mean value of the quantity R
which would be obtained when R is actually measured at the time t. It
can be shown that deviations from this expected mean are extremely small
when the system in question has many degrees of freedom, so that the
expected mean may be identified for practical purposes with the value of R
actually measured in a single observation. Moreover, we shall see at once
that under equilibrium conditions P is not a function of t, so that $\bar{R}$, also,
will not be a function of t. One may then think of $\bar{R}$ as the mean value
of the quantity R in a temporal sense, i.e., $\bar{R} = \frac{1}{T} \int_0^T R \, dt$ for sufficiently
large T, without violating the spirit of the postulate.

If the thermodynamic system is in equilibrium, the number of representative
points in any given extension in phase, $\Delta\phi$, must remain constant
in time. The condition of equilibrium may therefore be stated in the form

$$ \frac{\partial D}{\partial t} = 0 $$

For the equilibrium case, in which we are chiefly interested (a reversible
thermodynamic change consists of a sequence of equilibrium states),
Liouville’s theorem states

$$ \sum_{i=1}^{n} \left( \frac{\partial D}{\partial p_i} \dot{p}_i + \frac{\partial D}{\partial q_i} \dot{q}_i \right) = 0 \tag{12-26} $$


Let us now give thought to the initial form of the function
$D(p_1 \cdots p_n; q_1 \cdots q_n)$. We know that if it satisfies eq. (26), then it
implies $\partial D/\partial t = 0$ and corresponds to an equilibrium condition of the thermodynamic
system. Hence D will forever be independent of t. But we
note that if we put $D = D(H)$, where H is the Hamiltonian function of
the system, D will certainly satisfy eq. (26); for the left-hand side of that
equation will read

$$ \frac{dD}{dH} \sum_{i=1}^{n} \left( \frac{\partial H}{\partial p_i} \dot{p}_i + \frac{\partial H}{\partial q_i} \dot{q}_i \right) $$

and this vanishes because of (22). Hence we take $D = D(H)$.

Further restrictions on D cannot be imposed on the basis of mechanical
or statistical reasoning, except for the obvious facts that D must be everywhere
positive and must satisfy $\int D \, d\phi = N$. However, the choice of the
function must be such as to lead to the thermodynamic formulas when
thermodynamic quantities are computed by eq. (25). The important
choices by which this success can be achieved, as Gibbs has shown, are
these:

$$ D(H) = \text{const} \quad \text{when } E_0 < H < E_0 + \Delta E $$

$$ D(H) = 0 \quad \text{for all other values of } H \tag{12-27} $$

$$ D(H) = C e^{-H/\theta} \tag{12-28} $$

where C and $\theta$ are positive constants. The first is called the microcanonical
or energy shell ensemble, the second the canonical ensemble. The energy
shell ensemble seems most reasonable from the physical point of view, for a
system in equilibrium is one of fixed total energy, i.e., fixed within an
interval of error $\Delta E$, and systems not having an energy within this range
are excluded from consideration. However, the canonical ensemble,
although it assigns a finite density to points corresponding to those members
of the ensemble which do not satisfy the requirement of constant
energy, also leads to the correct thermodynamic relations. Since it is
mathematically easier to handle, it enjoys greater popularity than the
former, and was indeed preferred by Gibbs.

The connection between the two types of ensemble may be exhibited in
the following way. Consider a gas whose phase density in $\gamma$-space is
represented by a microcanonical ensemble. Let it consist of molecules
with $\mu$-spaces $\mu_1$, $\mu_2$, etc., with probability distribution $P_i$ in space $\mu_i$.
Denote the element of extension in $\mu_i$ by $d\phi_i$. Since energy exchanges may
take place between the molecules, $P_i$ cannot be represented by a microcanonical
distribution; it must indeed be finite for all energies $H_i$ of the
i-th molecule. Nevertheless, the probability that molecule 1 be within
the element $d\phi_1$ of its $\mu$-space, molecule 2 within $d\phi_2$ of its $\mu$-space, etc.,
simultaneously, equals the probability that the whole gas be in the element
$d\phi = d\phi_1 \, d\phi_2 \cdots d\phi_\nu$ of $\gamma$-space. Hence

$$ P_1(H_1) \, d\phi_1 \, P_2(H_2) \, d\phi_2 \cdots P_\nu(H_\nu) \, d\phi_\nu = P(H) \, d\phi \tag{12-29} $$

so that

$$ P_1(H_1) \cdot P_2(H_2) \cdots P_\nu(H_\nu) = P(H_1 + H_2 + \cdots + H_\nu) $$

We wish this functional equation to be satisfied for every value of the total
energy $H = \sum_i H_i$ although, of course, for any given H the constant P(H)
may be described by the microcanonical distribution. The solution of
eq. (29) therefore leads to a very natural extension of this distribution.

Eq. (29) holds for every $\nu$. If the gas consisted of only 2 molecules,

$$ P_1(H_1) \cdot P_2(H_2) = P(H_1 + H_2) $$

Hence it follows that $P_i(0) = 1$ for every i. If we denote $\log P_i$ by $f_i$, we
have

$$ f_1(H_1) + f_2(H_2) = f(H_1 + H_2) \tag{12-30} $$

On putting $H_2 = 0$, this reads

$$ f_1(H_1) + f_2(0) = f(H_1) $$

and since $f_2(0) = 0$, $f_1 = f$. Thus all $f_i$ are seen to be the same function, f.
We are thus led to consider the equation

$$ f(x) + f(y) = f(x + y) $$

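When y is taken equal to x, we have $2f(x) = f(2x)$, and so by induction,

$$ f(nx) = n f(x) $$

for every integer n. From this relation,

$$ f(x) = f\left(n \cdot \frac{x}{n}\right) = n f\left(\frac{x}{n}\right) $$

whence

$$ f\left(\frac{x}{n}\right) = \frac{1}{n} f(x) $$

and

$$ f\left(\frac{m}{n} x\right) = \frac{m}{n} f(x) $$

where m is another integer. Finally,

$$ f(x) = f(x \cdot 1) = x f(1) = \text{const} \cdot x $$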

We have shown that the only function which satisfies eq. (30) is $f_i = cH_i$,
whence

$$ P_i(H_i) = e^{cH_i} $$

But $P_i$, being a probability, must remain finite for every $H_i$, a quantity
which may tend to $+\infty$, though not to $-\infty$. Hence c is a negative constant.
Following Gibbs, we write for it $-1/\theta$, so that finally

$$ P_i(H_i) = e^{-H_i/\theta} \tag{12-31} $$

This defines the canonical ensemble, in accordance with eq. (28). The
constant C in (28) has been introduced to insure normalization in $\gamma$-space:
$\int P \, d\phi = 1$; the functions $P_i$ in (29) are not properly normalized in each
$\mu$-space, as is evident on closer inspection.


Problem. Consider as system a single particle of mass m in a constant gravitational
field. Note that the microcanonical ensemble is given by Fig. 1, where all points not
lying between the two parabolas A and B, corresponding to $H = E_0$ and
$H = E_0 + \Delta E$, have zero density. Show that the group of points lying between $p_1$ and
$p_2$ at $t = 0$, will lie between $p_1'$ and $p_2'$ at time t, such that $p_1' = p_1 + mgt$, $p_2' = p_2 + mgt$.
Prove also the invariance of the element of phase volume, i.e., area $\phi_1$ = area $\phi_2$.
(Liouville’s theorem.)
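The invariance asserted in this problem can be illustrated with a few lines of Python. The sketch assumes that the coordinate q is measured in the direction of the field, so that $H = p^2/2m - mgq$ and Hamilton's equations give $p(t) = p_0 + mgt$ and $q(t) = q_0 + p_0 t/m + gt^2/2$. A small rectangle of phase points stands in for the region between the parabolas, which is a simplification, and the names `flow` and `polygon_area` are illustrative.

```python
def flow(q, p, t, m=1.0, g=9.81):
    """Exact motion of a phase point (q, p) after time t in a uniform field."""
    return q + p * t / m + 0.5 * g * t * t, p + m * g * t


def polygon_area(points):
    """Shoelace formula for the area enclosed by an ordered list of (q, p) vertices."""
    twice_area = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        twice_area += q1 * p2 - q2 * p1
    return abs(twice_area) / 2.0


# A rectangular patch of phase points between p = 1 and p = 2 at t = 0.
region = [(0.0, 1.0), (0.5, 1.0), (0.5, 2.0), (0.0, 2.0)]
moved = [flow(q, p, t=3.0) for q, p in region]

print(polygon_area(region), polygon_area(moved))
```

Because the motion is a shear in q combined with a uniform shift in p, straight edges stay straight, so comparing the two polygon areas is exact here rather than approximate.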



12.7. Further Considerations Regarding the Canonical Ensemble. — 

As an illustration regarding the use of the canonical ensemble we derive 
the Maxwell law for the distribution of velocities in an ideal gas. In 





accordance with eq. (28) we put

$$ P \, d\phi = C e^{-H/\theta} \, dp_1 \, dp_2 \cdots dp_n \, dq_1 \, dq_2 \cdots dq_n = C e^{-H/\theta} \prod_{i=1}^{\nu} dp_{ix} \, dp_{iy} \, dp_{iz} \, dx_i \, dy_i \, dz_i $$

The constant C must be so chosen as to make $\int P \, d\phi = 1$. For an ideal
gas,

$$ H = \sum_{i=1}^{\nu} \left[ \frac{p_{ix}^2 + p_{iy}^2 + p_{iz}^2}{2m} + V(x_i, y_i, z_i) \right] = \sum_{i=1}^{\nu} H_i $$

where $V(x, y, z)$ is the potential energy of a particle in an external field if
such a field is present. The probability that particle i have an energy $H_i$
corresponding to $p_{ix}, p_{iy}, p_{iz}; x_i, y_i, z_i$, regardless of the states of all
other particles is clearly given by the integral $\int P \, d\phi$ extended over the
momenta and positions of all particles except i:

$$ P_i \, dp_{ix} \, dp_{iy} \, dp_{iz} \, dx_i \, dy_i \, dz_i = c' e^{-H_i/\theta} \, d\phi_i \tag{12-32} $$

$c'$ being some other constant. This relation, often called the Maxwell-Boltzmann
law, is really nothing more than eq. (31). When no external
field V is present, it may be written in more explicit form, for the constant
$c'$ can then be determined. Since

$$ \int c' e^{-H_i/\theta} \, d\phi_i = 1 $$

we have

$$ \frac{1}{c'} = \int \cdots \int e^{-(p_x^2 + p_y^2 + p_z^2)/2m\theta} \, dp_x \, dp_y \, dp_z \, dx \, dy \, dz = \tau \iiint e^{-(p_x^2 + p_y^2 + p_z^2)/2m\theta} \, dp_x \, dp_y \, dp_z $$

where $\tau$ is the volume of the gas. Thus

$$ \frac{1}{c'} = \tau \left[ \int_{-\infty}^{\infty} e^{-u^2/2m\theta} \, du \right]^3 = \tau \cdot (2m\theta)^{3/2} \cdot \pi^{3/2} $$

When (32) is now expressed in terms of velocities instead of momenta and
an integration over the volume is carried out on both sides, the result is

$$ P(v_x, v_y, v_z) \, dv_x \, dv_y \, dv_z = (2m\pi\theta)^{-3/2} e^{-(v_x^2 + v_y^2 + v_z^2)m/2\theta} \, d(mv_x) \, d(mv_y) \, d(mv_z) $$

$$ = \left( \frac{m}{2\pi\theta} \right)^{3/2} e^{-(v_x^2 + v_y^2 + v_z^2)m/2\theta} \, dv_x \, dv_y \, dv_z \tag{12-33} $$





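To find the meaning of the parameter $\theta$ we compute the mean energy of
the i-th particle, $\bar{H}_i$, which, as we know from simple statistical theory,
must be equal to $\tfrac{3}{2}kT$. We have

$$ \bar{H}_i = \frac{\tau}{2m} \iiint_{-\infty}^{\infty} (p_x^2 + p_y^2 + p_z^2) \, c' e^{-(p_x^2 + p_y^2 + p_z^2)/2m\theta} \, dp_x \, dp_y \, dp_z $$

$$ = \tau (2m)^{3/2} \theta^{5/2} c' \iiint (u^2 + v^2 + w^2) \, e^{-(u^2 + v^2 + w^2)} \, du \, dv \, dw $$

$$ = \tau (2m)^{3/2} \theta^{5/2} c' \cdot \tfrac{3}{2} \pi^{3/2} = \tfrac{3}{2}\theta $$

If this is to be equal to $\tfrac{3}{2}kT$, we must put

$$ \theta = kT \tag{12-34} $$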
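The value $\tfrac{3}{2}\theta$ for the mean energy can be confirmed by a direct quadrature. The sketch below assumes a field-free particle and reduces the threefold momentum integral to a radial one with weight $p^2 e^{-p^2/2m\theta}$ (the angular factor cancels in the ratio). The function name `mean_kinetic_energy` and the cutoff at twelve thermal widths are illustrative choices.

```python
import math


def mean_kinetic_energy(theta, m=1.0, steps=20000):
    """Canonical average of p^2/2m for one free particle, by the midpoint rule."""
    p_max = 12.0 * math.sqrt(m * theta)  # the integrand is negligible beyond this
    dp = p_max / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        p = (k + 0.5) * dp
        weight = p * p * math.exp(-p * p / (2.0 * m * theta))
        numerator += (p * p / (2.0 * m)) * weight * dp
        denominator += weight * dp
    return numerator / denominator


for theta in (0.5, 1.0, 2.0):
    print(theta, mean_kinetic_energy(theta))
```

The result does not depend on the mass, which is the content of the statement that the modulus of the distribution alone fixes the mean energy.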


Making this substitution in (33) we obtain Maxwell’s law for the distribution
of velocities

$$ P(v_x, v_y, v_z) \, dv_x \, dv_y \, dv_z = \left( \frac{m}{2\pi kT} \right)^{3/2} e^{-(v_x^2 + v_y^2 + v_z^2)m/2kT} \, dv_x \, dv_y \, dv_z \tag{12-35} $$

The probability that the absolute value of v shall lie between v and $v + dv$
is derived from this expression by transforming the “volume” element
$dv_x \, dv_y \, dv_z$ to spherical coordinates, where it takes the form $v^2 \, dv \sin\vartheta \, d\vartheta \, d\varphi$,
and then integrating over $\vartheta$ and $\varphi$. Thus

$$ P(v) \, dv = 4\pi v^2 \left( \frac{m}{2\pi kT} \right)^{3/2} e^{-(m/2kT)v^2} \, dv \tag{12-36} $$

According to its derivation P(v) denotes the probability that one
molecule shall have a speed about v. It is then clear that $\nu P(v)$ represents
the number of molecules having this speed. It is this last interpretation
which is usually given to Maxwell’s law.

For most purposes it is convenient to write the canonical distribution
law, eq. (28), in a slightly different form. When we put C, the positive
constant occurring in that equation, equal to $N e^{\psi/\theta}$, where $\psi$ is a new parameter
depending on $\theta$, we have

$$ P(H) = e^{(\psi - H)/\theta} \tag{12-37} $$

the standard form used by Gibbs.

In conclusion, let us attempt to correlate the quantities $\bar{H}$, $\psi$, and $\theta$
with thermodynamic quantities. This can be done through the thermodynamic
relations, the most important of which are:

$$ dU = T \, dS - \sum_i f_i \, d\xi_i \tag{12-38} $$

and

$$ -dA = S \, dT + \sum_i f_i \, d\xi_i \tag{12-39} $$




Here U stands for the internal energy of the system, S for its entropy, A
for the Helmholtz free energy. The force $f_i$ is defined by $f_i = -\partial H/\partial \xi_i$; it
is called into play when the i-th external coordinate is changed. The summation
$\sum_i f_i \, d\xi_i$ represents, therefore, the total work done by the thermodynamic
system when it undergoes a (reversible) change involving variation
of the $\xi_i$.

We now return to the ensemble whose distribution is given by (37).
This distribution will change in detail as the condition of the system
changes. But it will change in such a way that

$$ \int e^{(\psi - H)/\theta} \, d\phi = 1 $$


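On differentiating this relation (permitting the external parameters $\xi_i$ as
well as $\theta$ and hence $\psi$ to be altered) we have

$$ d \int e^{(\psi - H)/\theta} \, d\phi = \int P \left[ \frac{d\psi}{\theta} - \frac{\psi - H}{\theta^2} \, d\theta - \frac{1}{\theta} \sum_i \frac{\partial H}{\partial \xi_i} \, d\xi_i \right] d\phi $$

$$ = \frac{d\psi}{\theta} - \frac{\psi - \bar{H}}{\theta^2} \, d\theta + \frac{1}{\theta} \sum_i \bar{f}_i \, d\xi_i = 0 \tag{12-40} $$

provided we use eq. (37) and indicate averages over the ensemble by horizontal
bars, i.e., $\bar{Q} = \int P Q \, d\phi$. The last equation may be written

$$ -d\psi = -\overline{\ln P} \, d\theta + \sum_i \bar{f}_i \, d\xi_i \tag{12-41} $$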

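But since $\overline{\ln P} = \dfrac{\psi - \bar{H}}{\theta}$, we also have

$$ \psi = \theta \cdot \overline{\ln P} + \bar{H} \tag{12-42} $$

$$ d\psi = d\bar{H} + \theta \, d(\overline{\ln P}) + \overline{\ln P} \, d\theta $$

When this is substituted in (41), the result is

$$ d\bar{H} = -\theta \, d(\overline{\ln P}) - \sum_i \bar{f}_i \, d\xi_i \tag{12-43} $$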

Now it is clear that $d\bar{H}$ must be identified with the increase in total energy
of the thermodynamic system, dU. We have already established the relation
$\theta = kT$. Furthermore, the $\bar{f}_i$ can hardly be anything other than
the actual forces acting on the real system. We then see that (43) is the
exact analogue of the thermodynamic relation (38), provided we interpret
$-\overline{\ln P}$ as entropy divided by Boltzmann’s constant, k.

When eq. (41) is now compared with (39), $\psi$ is at once seen to be the
Helmholtz free energy, A. With this additional interpretation, eq. (42)
becomes the familiar

$$ A = U - TS $$





Further pursuit of this matter shows that all thermodynamic relations are 
satisfied if we correlate thermodynamic with statistical quantities as 
follows: 


    Total energy (U)             corresponds to  $\bar{H}$
    Absolute temp. (T)           corresponds to  $\theta/k$
    Entropy (S)                  corresponds to  $-k \, \overline{\ln P}$
    Helmholtz free energy (A)    corresponds to  $\psi$          (12-44)


When A is given, many thermodynamic properties of the system at hand
are known (cf. Chapter 1). It is therefore important to know how to compute
the free energy, i.e., $\psi$, statistically. Since $\int e^{(\psi - H)/\theta} \, d\phi = 1$, we
have $e^{-\psi/\theta} = \int e^{-H/\theta} \, d\phi$. The integral $\int e^{-H/\theta} \, d\phi = f$, which is thus
seen to be basic in the evaluation of $\psi$, is often called the phase integral
(also “sum of state” and “partition function”). In terms of it

$$ \psi = -kT \ln f $$
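As a concrete instance of computing the free energy from the phase integral, the following sketch evaluates the integral numerically for a one-dimensional harmonic oscillator, $H = p^2/2m + m\omega^2 q^2/2$. This system is an assumption made for illustration and is not treated in the text. For it the phase integral has the closed form $2\pi\theta/\omega$, which the code reproduces. The names `integrate`, `oscillator_phase_integral` and `free_energy` are hypothetical, and $\theta$ stands for kT.

```python
import math


def integrate(func, lo, hi, steps=4000):
    """Midpoint-rule quadrature of func over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * h) for k in range(steps)) * h


def oscillator_phase_integral(theta, m=1.0, omega=1.0):
    """Phase integral of exp(-H/theta) for H = p^2/2m + m omega^2 q^2 / 2."""
    p_lim = 12.0 * math.sqrt(m * theta)
    q_lim = 12.0 * math.sqrt(theta / m) / omega
    # H is a sum of a p-term and a q-term, so the double integral factorizes.
    part_p = integrate(lambda p: math.exp(-p * p / (2.0 * m * theta)), -p_lim, p_lim)
    part_q = integrate(lambda q: math.exp(-m * omega ** 2 * q * q / (2.0 * theta)), -q_lim, q_lim)
    return part_p * part_q


def free_energy(theta, m=1.0, omega=1.0):
    """Free energy: minus theta times the logarithm of the phase integral."""
    return -theta * math.log(oscillator_phase_integral(theta, m, omega))


print(oscillator_phase_integral(1.5), 2.0 * math.pi * 1.5)
print(free_energy(1.5))
```

The phase integral carries the dimensions of action here, so the numerical value of the free energy depends on the units chosen; only its variation with $\theta$ and the external parameters has thermodynamic meaning in this classical setting.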


Problem. Using (36), verify the following relations for the first and second
moments of the velocity distribution of an ideal gas:

$$ \overline{|v|} = \frac{2}{\sqrt{\pi}} \left( \frac{2kT}{m} \right)^{1/2} $$

$$ \overline{v^2} = \frac{3kT}{m} $$

Show also that the most probable velocity is $(2kT/m)^{1/2}$.
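Before working the integrals of this problem analytically, the stated moments can be checked by quadrature of the speed distribution (36). The sketch below is one way to do so, written in units where only the ratio kT/m enters. The names `maxwell_speed_density` and `speed_moments`, the grid size, and the cutoff are illustrative assumptions.

```python
import math


def maxwell_speed_density(v, kT_over_m):
    """P(v) of eq. (12-36), written with a = m / 2kT."""
    a = 1.0 / (2.0 * kT_over_m)
    return 4.0 * math.pi * v * v * (a / math.pi) ** 1.5 * math.exp(-a * v * v)


def speed_moments(kT_over_m, steps=20000):
    """Return (normalization, mean speed, mean square speed, most probable speed)."""
    v_max = 12.0 * math.sqrt(kT_over_m)
    dv = v_max / steps
    norm = mean = mean_sq = 0.0
    v_peak, p_peak = 0.0, 0.0
    for k in range(steps):
        v = (k + 0.5) * dv
        pv = maxwell_speed_density(v, kT_over_m)
        norm += pv * dv
        mean += v * pv * dv
        mean_sq += v * v * pv * dv
        if pv > p_peak:  # track the grid point where P(v) is largest
            v_peak, p_peak = v, pv
    return norm, mean, mean_sq, v_peak


print(speed_moments(2.0))
```

With kT/m = 2 the expected values are a mean speed of $4/\sqrt{\pi}$, a mean square speed of 6, and a most probable speed of 2; the last is located only to within the grid spacing.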

12.8. The Method of Darwin and Fowler. — A statistical method different
from that of Gibbs but also leading to the correct thermodynamic
laws, and more adaptable to the needs of quantum mechanics, has been
introduced by Darwin and Fowler. 10 We shall first describe its fundamental
features and then use it to derive quantum mechanical distribution
laws. Consider a system made up of $\nu$ similar particles. No reference
to an ensemble will here be made; all arguments concern this single,
real system. If the particles are independent, as will now be assumed, each
individual particle may be said to be in a definite energy state $\epsilon_i$, this $\epsilon_i$
being an eigenvalue of the Schrödinger equation (see Chapter 11) for the
single particle, with boundary conditions corresponding to the volume of
the total system if the latter is a fluid, or other suitable conditions if it is a

10 Phil. Mag. 44 , 450, 823 (1922); 45 , 1, 497 (1923). For a general and more 
recent treatment see Fowler, R. H., “ Statistical Mechanics,” Second Edition, Cambridge 
University Press, 1936. 





crystal. By the state of the total system we mean the aggregate of single
particle states, that is the assignment of individual particles to the various
energies $\epsilon_i$. But a microscopic assignment of particles to energies, as in the
statement: particle 1 has energy $\epsilon_i$, particle 2 has energy $\epsilon_j$, particle 3 has
energy $\epsilon_k$, etc., has no meaning in quantum mechanics in view of the exclusion
principle. The best that can be done in specifying a state is therefore
to say that $a_1$ particles have energy $\epsilon_1$, $a_2$ particles have energy $\epsilon_2$, $\cdots$ $a_i$
particles have energy $\epsilon_i$, etc. Thus a state is defined when a system of
“occupation numbers,” $a_1, a_2, \cdots a_i, \cdots$ is given. These must obviously
satisfy the relation

$$ \sum_\lambda a_\lambda = \nu \tag{12-45} $$

We shall also prescribe that the system shall have a fixed total energy E,
so that

$$ \sum_\lambda a_\lambda \epsilon_\lambda = E \tag{12-46} $$


Now it is possible, as will be shown in the next section, to assign a
statistical weight, $w(a_1 \cdots a_s)$, to each state $a_1 \cdots a_s$. The average of a
quantity $Q(a_1 \cdots a_s)$ which takes on different values for different states is
then defined by

$$ \bar{Q} = \frac{\sum w(a_1 \cdots a_s) \, Q(a_1 \cdots a_s)}{\sum w(a_1 \cdots a_s)} \tag{12-47} $$

The summations in this expression are understood to be taken over all
values of $a_1, a_2, a_3$, etc., which satisfy conditions (45) and (46); the index s
is in general very large; it is given by $\epsilon_s \le E$; $\epsilon_{s+1} > E$. Contact with
experience is made in the Darwin-Fowler theory by assuming that $\bar{Q}$ is the
observed value of the quantity Q when a measurement is made on the
system. The Gibbsian ensemble average is here replaced by an average
over the states of a single thermodynamic system. The fact that they
agree is rather noteworthy from a logical point of view. To carry through
the calculation of an average like (47) it is necessary (a) to construct a
suitable weighting function, w; (b) to devise means for evaluating the
restricted sums appearing in that equation.

12.9. Quantum Mechanical Distribution Laws. — In quantum mechanics,
the weight of an energy state is defined as its degree of degeneracy: it is
equal to the number of linearly independent state functions belonging to
the eigenstate in question. This postulate will here be invoked. Our
system, however, is one containing $\nu$ similar particles; hence it is necessary
to apply all the considerations of secs. 11.32 and 11.33, in particular the
exacting demands of the exclusion principle. But for the moment it seems
well to consider the number of eigenstates of E belonging to the statistical
state $(a_1 \cdots a_s)$ when the exclusion principle is left out of account.




As an example, consider the simple case of 5 particles, with energy
partition $a_1 = 2$ and $a_2 = 3$. Let the single-particle function belonging
to the single-particle energy $\epsilon_1$ be $\psi_1$, that belonging to $\epsilon_2$, $\psi_2$. The Schrödinger
equation for the 5 particles corresponding to $E = 2\epsilon_1 + 3\epsilon_2$ will
then be satisfied by the simple product

$$ \psi_1(1) \, \psi_1(2) \, \psi_2(3) \, \psi_2(4) \, \psi_2(5) \tag{12-48} $$

as well as by any function obtained from this product through permutation
of the arguments 1 to 5 (each numeral designates all coordinates, including
the spin, of the corresponding particle). But not all the 5! products thus
obtained are independent. For instance a transposition of 1 and 2, or a
permutation among particles 3, 4 and 5 causes no change in the function.
The number of different combinations is obviously equal to the number of
ways in which 5 objects can be arranged in 2 piles, one containing 2 the
other 3 objects. This, according to eq. (1), is

$$ \frac{5!}{2! \, 3!} $$


The generalization of this result is immediate; the number of different
energy eigenfunctions belonging to $(a_1 a_2 \cdots a_s)$ is given by

$$ w(a_1 \cdots a_s) = \frac{\nu!}{a_1! \, a_2! \cdots a_s!} \tag{12-49} $$

This is true so long as each individual particle function, $\psi_i$, is non-degenerate.
Suppose now that the energy $\epsilon_i$ itself can be realized by $g_i$
different functions. Each $\psi_i$ then has a weight $g_i$, and the product of $a_i$
such functions has a weight $g_i^{a_i}$. We then obtain, in place of (49), the more
general result

$$ w(a_1 \cdots a_s) = \frac{\nu! \, \prod_i g_i^{a_i}}{a_1! \, a_2! \cdots a_s!} \tag{12-50} $$


To see this in detail, let us return to the example (48) and assume $g_1 = 3$,
$g_2 = 2$. It is then necessary to introduce new functions, e.g., b, c, d in place
of $\psi_1$; e and f in place of $\psi_2$. Instead of $\psi_1(1)\psi_1(2)$ we can now have

b(1)b(2), c(1)c(2), d(1)d(2), b(1)c(2), b(2)c(1), b(1)d(2),
b(2)d(1), c(1)d(2), c(2)d(1)

and in place of $\psi_2(3)\psi_2(4)\psi_2(5)$

e(3)e(4)e(5), f(3)f(4)f(5), e(3)f(4)f(5), e(4)f(3)f(5), e(5)f(3)f(4),
e(3)e(4)f(5), e(3)e(5)f(4), e(4)e(5)f(3)

Eq. (50) would be the statistical weight of the state $a_1 \cdots a_s$ if no symmetry
requirements, no Pauli principle, had to be respected.
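The count in eq. (50) can be confirmed for the worked example by brute force. The sketch below assumes distinguishable particles, each assigned one single-particle function, and counts the assignments that put exactly two particles on functions of the first level (three functions) and three on the second (two functions). The function names `classical_weight` and `brute_force_weight` are illustrative.

```python
import itertools
import math


def classical_weight(occupations, degeneracies):
    """Eq. (12-50): nu! * prod(g_i ** a_i) / prod(a_i!), in exact integer arithmetic."""
    w = math.factorial(sum(occupations))
    for a, g in zip(occupations, degeneracies):
        w = w * g ** a // math.factorial(a)
    return w


def brute_force_weight(occupations, degeneracies):
    """Count assignments of labelled particles to functions with the given occupations."""
    # level_of[k] is the energy level to which single-particle function k belongs.
    level_of = [lvl for lvl, g in enumerate(degeneracies) for _ in range(g)]
    nu = sum(occupations)
    count = 0
    for assignment in itertools.product(range(len(level_of)), repeat=nu):
        tally = [0] * len(degeneracies)
        for func in assignment:
            tally[level_of[func]] += 1
        if tally == list(occupations):
            count += 1
    return count


print(classical_weight((2, 3), (3, 2)), brute_force_weight((2, 3), (3, 2)))
```

For the example this gives 10 ways of choosing which particles sit on which level, times the 9 and 8 products listed above, or 720 in all.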





How many different functions can be constructed out of the individual
$\psi_i$ when overall antisymmetry is demanded? We have seen that the only
antisymmetric function available has the determinantal form eq. (11-160').
This, however, vanishes when any two particles are described by the same
$\psi_i$, for then two rows of the determinant are equal. Hence, if there is no
degeneracy in the individual particle functions (all $g_i$ are 1), only one function
is constructible; in other words,

$$ w(a_1 \cdots a_s) = \begin{cases} 1 & \text{if every } a_i \le 1 \\ 0 & \text{if any } a_i > 1 \end{cases} $$

On the other hand, if the i-th state has a degeneracy $g_i > 1$, the number
of non-vanishing determinants which can be constructed is equal to the
number of ways in which $a_i$ arguments can be distributed among $g_i$ different
functions, and this, by virtue of eq. (2) is $\binom{g_i}{a_i}$. Thus we obtain in
general, when the exclusion principle is applied,

$$ w(a_1 \cdots a_s) = \prod_i \binom{g_i}{a_i} \tag{12-51} $$

Note that this vanishes when any $a_i$ is greater than its corresponding $g_i$, so
that the preceding equation is a special case of this.

Finally, we consider the case in which the total function is symmetrical.
As was shown in sec. 11.33, this is of the form $\sum_P \psi_P$ where $\psi_P$ is a function
constructed like (48), with a particular permutation of the arguments
1 to $\nu$. But if any permutation of arguments is made in $\sum_P \psi_P$ this function
is transformed into itself. Hence $w(a_1 \cdots a_s)$ is always 1 provided all
$g_i$ are 1. But if this is not true, then the degeneracy of $\epsilon_i$ gives rise to as many
different combinations of the $g_i$ functions belonging to $\epsilon_i$ as there are ways of
distributing the $a_i$ arguments amongst them, without regard to the number of
arguments associated with the same function. This number, by eq. (4), is
$\binom{g_i + a_i - 1}{a_i}$. We have thus determined w for the symmetrical case
to be

$$ w(a_1 \cdots a_s) = \prod_i \binom{g_i + a_i - 1}{a_i} \tag{12-52} $$
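The two per-level counting rules can be checked by listing the possibilities explicitly. In the sketch below, which is an illustration rather than the authors' procedure, an antisymmetric state of a level is identified with a choice of distinct functions, and a symmetric state with a choice in which repetitions are allowed. The function names are hypothetical; `itertools.combinations` and `itertools.combinations_with_replacement` do the enumeration.

```python
import itertools
import math


def antisymmetric_level_weight(g, a):
    """Ways to pick a distinct functions out of g (no function used twice)."""
    return sum(1 for _ in itertools.combinations(range(g), a))


def symmetric_level_weight(g, a):
    """Ways to pick a functions out of g when repetitions are allowed."""
    return sum(1 for _ in itertools.combinations_with_replacement(range(g), a))


def total_weight(level_weight, occupations, degeneracies):
    """Product of the per-level weights over all levels."""
    return math.prod(level_weight(g, a) for a, g in zip(occupations, degeneracies))


# The example used above: a = (2, 3) with g = (3, 2).
print(total_weight(antisymmetric_level_weight, (2, 3), (3, 2)))
print(total_weight(symmetric_level_weight, (2, 3), (3, 2)))
```

For this example the antisymmetric weight vanishes, because three particles cannot share a level with only two functions, while the symmetric weight is 6 times 4, or 24.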

Assemblies of particles whose motion is governed by the Pauli principle
must be described by antisymmetric functions. Their statistical weights
are given by (51). The formulas which ensue from its use are characteristic
of Fermi-Dirac statistics, the type of statistics to which electrons,
neutrons, protons are subject. Henceforth we refer to (51) as $w_{F.D.}$. On






the other hand, photons, nuclei and atoms containing an even number of
elementary particles (e.g., He$^4$) are known to require symmetrical state
functions for their collective description. Their statistical states have
weights given by (52); they are said to obey Einstein-Bose statistics.
Henceforth we write $w_{E.B.}$ in place of (52). No known constituents of
material bodies are described by (50), although it is precisely that assignment
of weights which leads to the Maxwell-Boltzmann law. It will be
shown, however, that both quantum formulations, (51) and (52), give rise
to distribution laws which under many thermodynamic circumstances are
practically identical with the Maxwell-Boltzmann law. For this reason,
and for the sake of generality, we continue to include eq. (50) in our consideration,
and refer to it as $w_c$ 11 (classical assignment of statistical weights).

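It is to be noted that all three statistical weights may be written in the
form

$$w = \prod_j \gamma(a_j)$$

if we put

$$\gamma_c(a_j) = \frac{g_j^{a_j}}{a_j!}, \qquad
  \gamma_{F.D.}(a_j) = \binom{g_j}{a_j}, \qquad
  \gamma_{E.B.}(a_j) = \binom{g_j + a_j - 1}{a_j} \qquad (12\text{-}53)$$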


In proceeding thus the factor $\nu!$ in eq. (50) is being omitted. However,
since this factor is independent of the $a$'s and hence constant for all
statistical states, it will cancel when averages are computed after the manner of
eq. (47). This is the principal use which will here be made of the weight
function. In many other problems the omission of $\nu!$ is not permissible.
(See the remarks after eq. 72.)
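The three factors of (53) are easy to evaluate directly, which can help in
checking small cases by hand. The following is a minimal sketch in Python;
the function names and the sample degeneracies and occupation numbers are
illustrative assumptions, not taken from the text, and the factor $\nu!$ of
eq. (50) is omitted as described above.

```python
from math import comb, factorial, prod

def gamma_c(g, a):
    """Classical factor g**a / a! of eq. (53)."""
    return g**a / factorial(a)

def gamma_fd(g, a):
    """Fermi-Dirac factor: binomial coefficient C(g, a); zero when a > g."""
    return comb(g, a)

def gamma_eb(g, a):
    """Einstein-Bose factor: binomial coefficient C(g + a - 1, a)."""
    return comb(g + a - 1, a)

def weight(gamma, gs, occ):
    """Statistical weight w = product over levels j of gamma(g_j, a_j)."""
    return prod(gamma(g, a) for g, a in zip(gs, occ))

# Hypothetical example: three levels with degeneracies 2, 3, 1
gs = [2, 3, 1]
occ = [1, 2, 0]          # occupation numbers a_1, a_2, a_3
w_fd = weight(gamma_fd, gs, occ)
w_eb = weight(gamma_eb, gs, occ)
w_c = weight(gamma_c, gs, occ)
```

Note that `gamma_fd` returns zero as soon as an occupation number exceeds
the degeneracy, which is the exclusion principle in this counting.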

The quantity $Q$ whose average we wish to calculate is $a_r$, the number of
particles having a given energy $\epsilon_r$. It is necessary, therefore, to evaluate

$$\bar a_r = \frac{\sum_{(a)} a_r \prod_j \gamma(a_j)}{\sum_{(a)} \prod_j \gamma(a_j)} = \frac{A}{W} \qquad (12\text{-}54)$$

$W$ is here written for the sum of all statistical weights compatible with
the fixed energy $E$, $A$ for $\bar a_r W$. The summations appearing in (54) are

11 The formula for $w_c$ can also be derived as follows, without reference to state
functions. Divide phase space into cells according to the energies of the individual
particles: in the $i$-th cell a particle has energy $\epsilon_i$. If the $i$-th cell has a fundamental
weight $g_i$, then the number of ways in which the state $a_1, a_2 \cdots a_s$ can be realized by
assignments of specific particles to cells is given by eq. (50).





taken over all values of $a_1, a_2, a_3 \cdots a_s$ which satisfy both (45) and (46).
We first calculate W. 

For this purpose consider the expansion

$$M = \sum_{a_1 a_2 \cdots a_s} \prod_j \left[\gamma(a_j)\, x^{a_j} z^{a_j \epsilon_j}\right] \qquad (12\text{-}55)$$

in which the summation over the $a$'s is entirely unrestricted, each one of
the many $a$'s taking all values from 0 to $\infty$. $M$ may be regarded as a
function of $x$ and $z$, depending parametrically upon the eigenvalues $\epsilon_j$
characteristic of the particles in question. A moment's reflection will show that
$W$ is the coefficient of $x^{\nu} z^{E}$ in the expansion $M$; in other words, because of
the theorem of residues eq. (3-3),


$$W = \left(\frac{1}{2\pi i}\right)^2 \oint\oint \frac{M\,dx\,dz}{x^{\nu+1} z^{E+1}} \qquad (12\text{-}56)$$

the integrals being taken counter-clockwise about the poles of the
integrand, i.e., about $x = 0$ and $z = 0$, $x$ and $z$ being considered as complex
variables.

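Now $M$ may be evaluated rather simply. First note that it can be
written

$$M = \sum_{a_1 a_2 \cdots a_s} \prod_j \left\{\gamma(a_j)\, x^{a_j} z^{a_j \epsilon_j}\right\}
    = \prod_j \left\{\sum_{n=0}^{\infty} \gamma(n)\, x^n z^{n\epsilon_j}\right\}$$

The summation in { } can be performed for all three of the functions $\gamma$
listed in (53). Let us put $xz^{\epsilon_j} = r$, $\sum_{n=0}^{\infty} \gamma(n) r^n = f(r)$. We then obtain

$$f_c = \sum_{n=0}^{\infty} \frac{g_j^n r^n}{n!} = \exp(g_j r) = \exp\left[g_j x z^{\epsilon_j}\right]$$

$$f_{F.D.} = \sum_{n=0}^{\infty} \binom{g_j}{n} r^n = (1 + r)^{g_j} = (1 + xz^{\epsilon_j})^{g_j} \qquad (12\text{-}57)$$

$$f_{E.B.} = \sum_{n=0}^{\infty} \binom{g_j + n - 1}{n} r^n = (1 - r)^{-g_j} = (1 - xz^{\epsilon_j})^{-g_j}$$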

The last result, which is perhaps not so obvious, is easily verified by writing
down the Maclaurin expansion of

$$(1 - r)^{-g} = 1 + gr + \frac{g(g+1)}{2!} r^2 + \frac{g(g+1)(g+2)}{3!} r^3 + \cdots$$

which is identical with the summation in $f_{E.B.}$.
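The three closed forms of (57) can also be checked numerically by summing
the series term by term. The sketch below does this in Python for one
arbitrarily chosen degeneracy and one value of $r$; the helper name
`partial_sum`, the number of terms and the sample values are assumptions
made only for this illustration.

```python
from math import comb, exp, factorial

def partial_sum(gamma, g, r, terms=60):
    """Partial sum of gamma(g, n) * r**n for n = 0 .. terms-1."""
    return sum(gamma(g, n) * r**n for n in range(terms))

# Illustrative values; |r| < 1 is needed for the E.B. series to converge.
g, r = 3, 0.2

s_c = partial_sum(lambda g, n: g**n / factorial(n), g, r)
s_fd = partial_sum(lambda g, n: comb(g, n), g, r)
s_eb = partial_sum(lambda g, n: comb(g + n - 1, n), g, r)

closed_c = exp(g * r)          # exp(g r)
closed_fd = (1 + r) ** g       # (1 + r)^g
closed_eb = (1 - r) ** (-g)    # (1 - r)^(-g)
```

The F.D. series terminates by itself, since the binomial coefficient
vanishes for $n > g$; the other two are infinite series truncated here.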





Thus we see that

$$M = \prod_j f(xz^{\epsilon_j}) \qquad (12\text{-}58)$$

and $W$ is related to $M$ by eq. (56). The effect of the degeneracy factors $g_j$
in (57) is rather interesting: they merely appear as exponents in the
$f$-functions in all three cases.

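Let us now consider the numerator of eq. (54). If we calculate the
quantity $(1/\ln z)(\partial M/\partial \epsilon_r)$, we find, using eq. (55),

$$\frac{1}{\ln z}\frac{\partial M}{\partial \epsilon_r}
  = \frac{1}{\ln z} \sum_{a_1 a_2 \cdots a_s} \frac{\partial}{\partial \epsilon_r}
    \prod_j \left[\gamma(a_j)\, x^{a_j} e^{a_j \epsilon_j \ln z}\right]
  = \sum_{a_1 a_2 \cdots a_s} a_r \prod_j \left[\gamma(a_j)\, x^{a_j} z^{a_j \epsilon_j}\right]$$

On comparing this with (54) we see that $A$ is the coefficient of $x^{\nu} z^{E}$ in the
expansion of $(1/\ln z)(\partial M/\partial \epsilon_r)$. But this last quantity can be put in a
form more suitable for our purposes. In view of (58),

$$\frac{1}{\ln z}\frac{\partial M}{\partial \epsilon_r}
  = \frac{1}{\ln z}\,\frac{M}{f(xz^{\epsilon_r})}\,
    \frac{\partial f(xz^{\epsilon_r})}{\partial (xz^{\epsilon_r})}\,
    \frac{\partial (xz^{\epsilon_r})}{\partial \epsilon_r}
  = M x \frac{\partial}{\partial x} \ln f(xz^{\epsilon_r})$$

Summarizing these steps, we note:

$$A = \left(\frac{1}{2\pi i}\right)^2 \oint\oint
      \frac{M\, x \frac{\partial}{\partial x} \ln f(xz^{\epsilon_r})}{x^{\nu+1} z^{E+1}}\,dx\,dz \qquad (12\text{-}59)$$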

If now it were possible to find a path of integration around the origin
with respect to both $x$ and $z$, such that the function $Mx^{-\nu-1}z^{-E-1}$ were
practically zero everywhere along that path except in the immediate
neighborhood of two definite points, say $x = \xi$ and $z = \vartheta$, the evaluation
of $\bar a_r = A/W$ would be very simple. For it would then be permissible to
take the factor $x(\partial/\partial x) \ln f$, multiplying the integrand in (59), in front
of the integral sign and give to $x$ and $z$ the values $\xi$ and $\vartheta$, and the integrals
themselves would cancel. We should then have

$$\bar a_r = \xi \frac{\partial}{\partial \xi} \ln f(\xi \vartheta^{\epsilon_r}) \qquad (12\text{-}60')$$

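This procedure is indeed justified, as the following section will show. If
the $f$-functions are identified in accordance with (57), the result is seen to be

$$(\bar a_r)_c = g_r \xi \vartheta^{\epsilon_r}$$

$$(\bar a_r)_{F.D.} = \frac{g_r \vartheta^{\epsilon_r}}{\xi^{-1} + \vartheta^{\epsilon_r}}$$

$$(\bar a_r)_{E.B.} = \frac{g_r \vartheta^{\epsilon_r}}{\xi^{-1} - \vartheta^{\epsilon_r}}$$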





The physical significance of the parameters $\xi$ and $\vartheta$ can be fixed most simply
by the subsequent plausibility argument which we offer in lieu of more
detailed considerations 12 adducible to establish this meaning more
completely. The first of the expressions above must be the Maxwell-Boltzmann
law which reads, for the situation here considered, $\bar a_r = A_0 g_r e^{-\epsilon_r/kT}$,
the factor $A_0$ being so determined that $\sum_r \bar a_r = \nu$. Hence $\xi$ must be
identified with $A_0$, and $\vartheta$ with $e^{-1/kT}$. In the other two relations $\xi$ must also act
as a normalizing factor, while $\vartheta$ has the same meaning as in $(\bar a_r)_c$. Hence
we conclude

$$(\bar a_r)_c = \xi g_r e^{-\epsilon_r/kT} \qquad \text{(a)}$$

$$(\bar a_r)_{F.D.} = \frac{g_r e^{-\epsilon_r/kT}}{\xi^{-1} + e^{-\epsilon_r/kT}} \qquad \text{(b)} \qquad (12\text{-}60)$$

$$(\bar a_r)_{E.B.} = \frac{g_r e^{-\epsilon_r/kT}}{\xi^{-1} - e^{-\epsilon_r/kT}} \qquad \text{(c)}$$

It is easily seen in a qualitative way that $\xi$ must increase when $\nu$
increases (the volume of the system being fixed) if $\sum_r \bar a_r$ is to remain equal
to $\nu$. In (a), $\xi$ is in fact proportional to $\nu$, but this simple dependence
fails in (b) and (c). Nevertheless, if $\nu$ is very small, $\xi^{-1} \gg 1 > e^{-\epsilon_r/kT}$.
In this case both (b) and (c) reduce to the classical form (a). Hence for
sufficiently small densities all assemblies show an essentially classical 
behavior. Closer investigation (see any of the references at the beginning 
of this chapter) indicates that this is true for all ordinary molecules at 
ordinary temperatures, thus justifying the use of classical statistics. 
The main instances in which quantum distribution laws are needed are the 
motion of electrons in metals (b), the photon gas, and helium at very low 
temperatures (c). 
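The approach of (b) and (c) to the classical form (a) at small $\xi$ can be
seen numerically. The following Python sketch evaluates the three laws of
(60) for a single level; the function name `occupation`, the string labels
and all numerical values are illustrative assumptions, with energies and
$kT$ measured in the same arbitrary unit.

```python
from math import exp

def occupation(g, eps, xi, kT, kind):
    """Mean occupation number of one level according to eqs. (60a)-(60c).

    kind is "c" (classical), "fd" (Fermi-Dirac) or "eb" (Einstein-Bose).
    """
    boltz = exp(-eps / kT)
    if kind == "c":
        return xi * g * boltz
    if kind == "fd":
        return g * boltz / (1.0 / xi + boltz)
    if kind == "eb":
        # meaningful only while xi * boltz < 1
        return g * boltz / (1.0 / xi - boltz)
    raise ValueError("kind must be 'c', 'fd' or 'eb'")

# Illustrative numbers only: a dilute case (small xi) and a denser one.
for xi in (1e-6, 0.5):
    a_c = occupation(2, 1.0, xi, 1.0, "c")
    a_fd = occupation(2, 1.0, xi, 1.0, "fd")
    a_eb = occupation(2, 1.0, xi, 1.0, "eb")
    print(xi, a_c, a_fd, a_eb)
```

For the small value of $\xi$ the three numbers printed agree to several
figures; for the larger value the F.D. result lies below and the E.B.
result above the classical one.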

All thermodynamic relations can be deduced by the method here 
described provided the following associations between thermodynamic 
quantities and elements appearing in the Darwin-Fowler scheme are 
made; the first is obvious, the second has already been obtained; the 
others will be derived later on (cf. eqs. 71 and 72):

    $U$ corresponds to $E$
    $T$ corresponds to $-(k \ln \vartheta)^{-1}$
    $S$ corresponds to $k \ln W$                                          (12-61)
    $\nu A$ corresponds to $-kT\left(\sum_j \ln f(\xi \vartheta^{\epsilon_j}) - \nu \ln \xi\right)$

12.10. The Method of Steepest Descents. — The evaluation of $\bar a_r$ in the
last section depended for its validity on our ability to find a point $x = \xi$ in
12 Cf. Fowler, loc. cit. 





the integration with respect to $x$, and another point $z = \vartheta$ in the integration
with respect to $z$, at which the integrand of (56) would be large, and
where its value would descend very steeply on both sides along the path
of integration. Such a point has interesting properties which we shall
first investigate in connection with a simpler but more general example.

Let $\varphi(z)$ be a function in the complex plane, $z$ being $x + iy$. We wish
to find a point in the $x, y$-plane such that, as we cross that point in the
direction of steepest descent of $\varphi$, $\varphi$ will be a maximum at the point. To
be more specific we will refer in this inquiry not to $\varphi$, but to the real part
of $\varphi$. Suppose the point thus defined is $z = \vartheta$.

On writing

$$\varphi(z) = g(x,y) + ih(x,y) \qquad (12\text{-}62)$$

where $g$ and $h$ are real functions, it is clear that these must satisfy the
Riemann relations: 13

$$g_x = h_y, \qquad g_y = -h_x \qquad (12\text{-}63)$$

Our specification amounts to this: $dg = g_x\,dx + g_y\,dy$ shall be zero along
the path on which $g$ decreases most rapidly. The direction of this path is
the direction of the negative gradient of $g$, namely $-\nabla g$. Since this is
$-(\mathbf{i}\, g_x + \mathbf{j}\, g_y)$, this direction is defined by $dy/dx = g_y/g_x$. But by reason
of eq. (63), this is $-h_x/h_y$. On the other hand, if $dy/dx = -h_x/h_y$, then

$$h_x\,dx + h_y\,dy = dh = 0$$

in the same direction. Now the vanishing of both $dg$ and $dh$ at $\vartheta$ is possible
only if $d\varphi = 0$. We may conclude, therefore, that the point in question,
if it exists at all, satisfies the condition

$$\varphi' = 0$$

If $\varphi'$ has a real root, the point $\vartheta$ will obviously lie on the real axis.

Next, it will be shown that $\vartheta$ is a "saddle point," i.e., that the curvature
on the path of steepest descent is opposite to that along a path at right
angles to this direction. For any direction $dy/dx$

$$d^2g = g_{xx}\,dx^2 + 2g_{xy}\,dx\,dy + g_{yy}\,dy^2$$
13 By $g_x$ is meant $\partial g/\partial x$, etc. To prove these well known relations we observe

$$\varphi_x = \varphi' = g_x + ih_x$$
$$\varphi_y = i\varphi' = g_y + ih_y$$

with $\varphi' = d\varphi/dz$. Hence $\varphi' = g_x + ih_x = -ig_y + h_y$, which is equivalent to eqs. (63)
when real and imaginary parts are equated separately. Note also that (63) implies:

$$g_{xx} = -g_{yy}.$$





The direction at right angles to $dy/dx$ is fixed by the substitution of $-dy$
for $dx$, and of $dx$ for $dy$. Hence, on writing $d^2g_\perp$ for the curvature at right
angles to the direction $dy/dx$,

$$d^2g_\perp = g_{xx}\,dy^2 - 2g_{xy}\,dx\,dy + g_{yy}\,dx^2$$

so that

$$d^2g + d^2g_\perp = (g_{xx} + g_{yy})(dx^2 + dy^2) = 0$$

in view of eqs. (63).

With this general knowledge, let us return to the calculation of (56):

$$W = (2\pi i)^{-2} \iint \frac{M(x,z)}{x^{\nu} z^{E}}\,\frac{dx}{x}\,\frac{dz}{z} \qquad (12\text{-}64)$$

$$M = \prod_j f(r_j), \qquad r_j = xz^{\epsilon_j}$$

Let us further put

$$Mx^{-\nu}z^{-E} = e^{Y(x,z)}$$

so that

$$Y = \sum_j \ln f(r_j) - \nu \ln x - E \ln z \qquad (12\text{-}65)$$

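A saddle point of the integrand of (64) is then determined by

$$\frac{\partial Y}{\partial x} = 0$$

in the integration with respect to $x$, and by

$$\frac{\partial Y}{\partial z} = 0$$

in the integration with respect to $z$. The first of these leads to

$$\frac{1}{\xi} \sum_j \rho_j \frac{f'(\rho_j)}{f(\rho_j)} - \frac{\nu}{\xi} = 0 \qquad (12\text{-}66)$$

the second to

$$\frac{1}{\vartheta} \sum_j \epsilon_j \rho_j \frac{f'(\rho_j)}{f(\rho_j)} - \frac{E}{\vartheta} = 0 \qquad (12\text{-}67)$$

where $\rho_j$ has been written for $\xi \vartheta^{\epsilon_j}$. Eqs. (66) and (67) define the saddle
point $(\xi, \vartheta)$ in $x$-$z$ space.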





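These results at once take on a more interesting form when we insert
eq. (60'):

$$\bar a_j = \rho_j \frac{f'(\rho_j)}{f(\rho_j)}$$

The saddle point conditions are then seen to be nothing more than the
conservation conditions of our problem:

$$\sum_j \bar a_j = \nu \qquad (12\text{-}66')$$

$$\sum_j \epsilon_j \bar a_j = E \qquad (12\text{-}67')$$

In classical statistics we may also write

$$\xi \sum_j g_j \vartheta^{\epsilon_j} = \nu, \qquad
  \xi \sum_j g_j \epsilon_j \vartheta^{\epsilon_j} = E$$

In quantum statistics, the equations become

$$\xi \sum_j \frac{g_j \vartheta^{\epsilon_j}}{1 \pm \xi \vartheta^{\epsilon_j}} = \nu, \qquad
  \xi \sum_j \frac{g_j \epsilon_j \vartheta^{\epsilon_j}}{1 \pm \xi \vartheta^{\epsilon_j}} = E$$

where the positive sign is to be taken in the Fermi-Dirac case and the
negative sign in the Bose-Einstein case.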
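In the quantum cases these conditions cannot in general be solved in
closed form, but for a given $\vartheta$ the particle-number condition fixes $\xi$
and is easily solved numerically. The sketch below does this for the
Fermi-Dirac case by bisection; it is written in Python, and the function
names, the bracketing interval and the level scheme are assumptions chosen
only for illustration (energies are taken as dimensionless numbers).

```python
from math import exp

def total_fd(xi, gs, eps, theta):
    """Sum over levels of g_j * xi*theta**eps_j / (1 + xi*theta**eps_j)."""
    return sum(g * xi * theta**e / (1.0 + xi * theta**e) for g, e in zip(gs, eps))

def solve_xi_fd(nu, gs, eps, theta, lo=1e-12, hi=1e12, steps=200):
    """Bisection on a logarithmic scale for xi with total_fd(xi) = nu.

    Assumes 0 < nu < sum(gs), so that a root exists inside [lo, hi];
    total_fd increases steadily with xi.
    """
    for _ in range(steps):
        mid = (lo * hi) ** 0.5
        if total_fd(mid, gs, eps, theta) < nu:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

# Hypothetical level scheme
gs = [2, 2, 4, 4]
eps = [1.0, 2.0, 3.0, 4.0]
theta = exp(-1.0)                 # corresponds to kT = 1 in the chosen unit
xi = solve_xi_fd(3.0, gs, eps, theta)

# With xi known, the second condition gives the energy E for this theta.
energy = sum(g * e * xi * theta**e / (1.0 + xi * theta**e) for g, e in zip(gs, eps))
```

If instead $\nu$ and $E$ are both prescribed, the same routine can be nested
inside a second one-dimensional search over $\vartheta$.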

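In the following we shall also need the values of $\partial^2 Y/\partial x^2$, $\partial^2 Y/\partial z^2$, and
$\partial^2 Y/\partial x\,\partial z$ at the saddle point. To save writing the discussion will be
limited for the present to F.D. statistics. Here one finds 14

$$\frac{\partial^2 Y}{\partial x^2} = \frac{1}{\xi^2} \sum_j \frac{g_j \xi \vartheta^{\epsilon_j}}{(1 + \xi \vartheta^{\epsilon_j})^2} = \frac{A}{\xi^2}$$

$$\frac{\partial^2 Y}{\partial z^2} = \frac{1}{\vartheta^2} \sum_j \frac{g_j \epsilon_j^2\, \xi \vartheta^{\epsilon_j}}{(1 + \xi \vartheta^{\epsilon_j})^2} = \frac{B}{\vartheta^2}$$

$$\frac{\partial^2 Y}{\partial x\,\partial z} = \frac{1}{\xi \vartheta} \sum_j \frac{g_j \epsilon_j\, \xi \vartheta^{\epsilon_j}}{(1 + \xi \vartheta^{\epsilon_j})^2} = \frac{C}{\xi \vartheta}$$

The quantities $A$, $B$, and $C$ are to be defined by these relations; it is
easily seen that, for a gas consisting of many particles, they are very large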

14 Note that the symbol A has a different meaning than in the last section! Neither 
is it to be confused with the Helmholtz free energy. 





numbers. 15 Moreover, the reader will be able to show that

$$AB > C^2 \qquad (12\text{-}68)$$

always.

Having located the saddle point, we expand $Y(x,z)$ in its neighborhood.

$$Y(x,z) = Y(\xi,\vartheta) + \frac{\partial Y}{\partial x}(x - \xi) + \frac{\partial Y}{\partial z}(z - \vartheta)
  + \frac{1}{2}\left[\frac{\partial^2 Y}{\partial x^2}(x - \xi)^2
  + 2\frac{\partial^2 Y}{\partial x\,\partial z}(x - \xi)(z - \vartheta)
  + \frac{\partial^2 Y}{\partial z^2}(z - \vartheta)^2\right] + \cdots \qquad (12\text{-}69)$$

The first derivatives vanish at the saddle point, which, as may be shown
from eqs. (66) and (67), lies on the real axis of both $x$ and $z$. In order to
carry out the integrations in (64), it is suggested that the paths be taken
across $\xi$ and $\vartheta$, and this is done with greatest convenience by choosing the
circuits

$$x = \xi e^{i\alpha}, \quad -\pi < \alpha \le \pi; \qquad z = \vartheta e^{i\beta}, \quad -\pi < \beta \le \pi$$

When these substitutions are made in (69), this expression becomes

$$Y(x,z) = Y(\xi,\vartheta) - \tfrac{1}{2}(A\alpha^2 + B\beta^2 + 2C\alpha\beta)$$

in the neighborhood of $\xi$, $\vartheta$, since for small $\alpha$ and $\beta$

$$x - \xi = i\xi\alpha \quad \text{and} \quad z - \vartheta = i\vartheta\beta$$

Therefore, in view of (65),

$$Mx^{-\nu}z^{-E} = e^{Y(\xi,\vartheta)}\, e^{-(1/2)(A\alpha^2 + B\beta^2 + 2C\alpha\beta)}$$

and

$$W = (2\pi i)^{-2} \iint e^{Y(\xi,\vartheta) - (1/2)(A\alpha^2 + B\beta^2 + 2C\alpha\beta)}\,(i\,d\alpha)(i\,d\beta) \qquad (12\text{-}70)$$

This result shows with impressive clarity how rapidly the integrand
"descends" from its saddle point: its "half width" with respect to $\alpha$, for
example, is approximately given by $A^{-1/2}$. But $A$ is of the same order of
magnitude as $\nu$, the total number of particles. The procedure of the
foregoing section was therefore proper.

The question arises as to the behavior of the integrand at points on the
contour not in the neighborhood of the saddle point, for we hardly have
reason thus far to expect that it is small everywhere else. This, however,
is not difficult to prove. When written in terms of $\alpha$ and $\beta$, the function $f$
of (57) takes the form

$$f_{F.D.} = (1 + \xi\vartheta^{\epsilon_j} e^{iu_j})^{g_j}, \qquad u_j = \alpha + \beta\epsilon_j$$

15 The $\epsilon_j$ must here be regarded as dimensionless and of order of magnitude unity
or greater. This can be achieved by measuring $kT$ and $\epsilon_j$ in the same conveniently
chosen unit, in which case $kT$, and hence $\vartheta$ also, become pure numbers.





and $M$ becomes

$$M = \prod_j (1 + 2\xi\vartheta^{\epsilon_j} \cos u_j + \xi^2\vartheta^{2\epsilon_j})^{g_j/2}
      \exp\left\{ i g_j \tan^{-1} \frac{\xi\vartheta^{\epsilon_j} \sin u_j}{1 + \xi\vartheta^{\epsilon_j} \cos u_j} \right\}$$

The product of all exponential terms, whose exponents are purely imaginary,
can never exceed unity in absolute value, the value which it has at the saddle
point. Each term in the first parenthesis attains its maximum when $u_j = 0$,
or in general $2n\pi$. If $u_j = 2n\pi$ for a given choice of $\alpha$ and $\beta$ there will be
very many $u_k$, $k \ne j$, for which the parenthesis will not assume its maximum
value, so that the product, having a great number of factors, will be much
smaller than its value at $\xi$, $\vartheta$. The only way to insure that $M$ will be a
maximum is to make $u_j = 0$ for all $j$, and this requires that both $\alpha$ and $\beta$
be zero. This maximum will be very strong provided the number of energy
states, $\epsilon_j$, is large.

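It is of some interest, finally, to conclude the explicit calculation of $W$.
The integral in (70) is easily performed, for the limits may clearly be
replaced by $+\infty$ and $-\infty$. Remembering the formula

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(A\alpha^2 + B\beta^2 + 2C\alpha\beta)}\,d\alpha\,d\beta = \frac{2\pi}{\sqrt{AB - C^2}}$$

we find

$$W = \frac{e^{Y(\xi,\vartheta)}}{2\pi\sqrt{AB - C^2}}$$

which, in view of (68), is real and positive. The entropy, defined in (61),
becomes therefore

$$S = kY(\xi,\vartheta) - k \ln\left[2\pi\sqrt{AB - C^2}\right] \qquad (12\text{-}71)$$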
In chemistry, it is customary to neglect the second term of S because it is 
much smaller than the first when the number of particles is large. 16 Now 


$$Y(\xi,\vartheta) = \sum_j \ln f(\xi\vartheta^{\epsilon_j}) - \nu \ln \xi - E \ln \vartheta
  = \sum_j \ln f(\xi\vartheta^{\epsilon_j}) - \nu \ln \xi + \frac{E}{kT}$$

In view of (71), then,

$$E - ST = -kT\left(\sum_j \ln f - \nu \ln \xi\right) \qquad (12\text{-}72)$$

This justifies the identification of the free energy made in (61).

Finally let us endeavor to make contact with classical statistics again.
Here it must be remembered that a factor $\nu!$ was omitted in the evaluation
of $W$. We must therefore add to (71) the quantity $k \ln \nu!$, so that we have

16 The full expression must be used when attention is given to the entropy of a
nucleus, which contains relatively few neutrons and protons.





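in place of (72) for the classical free energy $A_{\text{class}}$,

$$\nu A_{\text{class}} = -kT\left(\sum_j \ln f - \nu \ln \xi + \ln \nu!\right)$$

But

$$\sum_j \ln f = \nu$$

by eq. (66') and

$$\ln \nu! \approx \nu \ln \nu - \nu$$

by Stirling's theorem. Hence

$$\nu A_{\text{class}} = -\nu kT \ln (\nu/\xi)$$

Again by (66'), $\nu/\xi = \sum_j g_j \vartheta^{\epsilon_j}$, a quantity to be denoted by $Z$ and often
called the partition function. In terms of $Z$,

$$A_{\text{class}} = -kT \ln Z$$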


Comparison with the last equation of sec. 12.7 shows that Z is the quantum 
mechanical analogue of Gibbs' phase integral. 17 
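As a small numerical illustration of the partition function, the sketch
below evaluates $Z$, the classical free energy per particle $-kT \ln Z$ and
the Maxwell-Boltzmann occupation numbers $\xi g_j \vartheta^{\epsilon_j}$ with
$\xi = \nu/Z$. It is written in Python; the two-level system and all names
are hypothetical and serve only to show the bookkeeping.

```python
from math import exp, log

def partition_function(gs, eps, kT):
    """Z = sum over levels of g_j * exp(-eps_j / kT)."""
    return sum(g * exp(-e / kT) for g, e in zip(gs, eps))

def classical_populations(nu, gs, eps, kT):
    """Occupation numbers a_j = (nu / Z) * g_j * exp(-eps_j / kT)."""
    z = partition_function(gs, eps, kT)
    return [nu * g * exp(-e / kT) / z for g, e in zip(gs, eps)]

# Hypothetical two-level system; energies and kT in the same unit
gs, eps, kT, nu = [1, 3], [0.0, 1.0], 0.5, 1000.0

Z = partition_function(gs, eps, kT)
free_energy_per_particle = -kT * log(Z)      # A_class = -kT ln Z
pops = classical_populations(nu, gs, eps, kT)
```

The populations add up to $\nu$ by construction, which is the classical form
of the first conservation condition.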

The computational aspects of the statistical method can be simply
summarized as follows. Suppose we are given a system of $\nu$ particles whose
total energy is $E$. Each particle has energies $\epsilon_j$ obtainable by solving the
Schrödinger equation with boundary conditions corresponding to the volume
in which the particles are enclosed. For instance, if the volume is a
parallelepiped, the $\epsilon_j$ are given by eq. (11-36). Thus they depend on the
volume of the container.

The thermodynamic properties of the system then depend on two
parameters, $\xi$ and $\vartheta$, defined by eqs. (66') and (67'). When these are
solved simultaneously and $\xi$ and $\vartheta$ are known for the given $\nu$ and $E$, the
quantities $T$, $S$ and $A$ can be calculated from (61). In F.D. and E.B.
statistics, eqs. (66') and (67') are such that there is no general method for
obtaining explicit solutions for $\xi$ and $\vartheta$. Recourse must then be had to
approximations, valid in different ranges of the parameter $\xi$. 18

Problem. Find the values of $A$, $B$, $C$ in classical and in E.B. statistics. Note that
the classical values are obtainable from those derived in the preceding section by
letting $\xi \to 0$.

17 Partition functions for specific substances can often be computed from
spectroscopic data. Such calculations are becoming increasingly important in applied
thermodynamics. See Taylor, H. S., and Glasstone, S., "A Treatise on Physical Chemistry,"
Third Edition, Vol. 1, D. Van Nostrand Co., Inc., New York, 1942.

18 See for instance Fowler, loc. cit.




REFERENCES

Tolman, R. C., "The Principles of Statistical Mechanics," Clarendon Press, Oxford, 1938.

Chapman, S., and Cowling, T. G., "The Mathematical Theory of Non-Uniform Gases,"
University Press, Cambridge, 1939.

Mayer, J. E., and Mayer, M. G., "Statistical Mechanics," John Wiley and Sons, Inc.,
1940.

Lindsay, R. B., "Physical Statistics," John Wiley and Sons, Inc., 1941.

Schrödinger, E., "Statistical Thermodynamics," Cambridge Press, 1948.

Rushbrooke, G. S., "Introduction to Statistical Mechanics," Oxford Press, 1949.

Ter Haar, D., "Elements of Statistical Mechanics," Rinehart and Company, New York,
1954.



CHAPTER 13 

NUMERICAL CALCULATIONS 


13.1. Introduction. — We describe here certain types of numerical 
calculations which are often required. No theory 1 is presented, but the 
methods are explained and illustrated by means of worked examples. 
The reader will find that such computations are usually tedious and time- 
consuming, hence he should exercise his ingenuity in devising means of 
reducing the labor involved. Before starting a calculation, he should 
always consider the possibility of using graphical methods, for these are 
often simpler than the numerical ones. He should also remember that 
there is some advantage in representing numerical data by equations of 
empirical or theoretical form. Such equations, obtained by the method 
of least squares or otherwise (see sec. 13.37) are generally easier to use for 
interpolation, differentiation or integration than the methods of this 
chapter. Finally, he should note that when alternative procedures are 
given for a particular operation, the special problem at hand may often 
suggest which of these is the most suitable. 

It is assumed that the reader is familiar with the elementary facts 
concerning significant figures, rounding off and number of significant 
figures to be retained in addition, multiplication, etc. 2 

For convenience, we divide this chapter into three separate parts. 
The first deals with methods primarily based on interpolation formulas; 
the second, with miscellaneous algebraic calculations and the third with a 
discussion of errors and related problems.

PART 1. NUMERICAL METHODS BASED ON 
INTERPOLATION FORMULAS 

INTERPOLATION 

13.2. Interpolation for Equal Values of the Argument. — It often happens
that data are given in tabular form with values of $x$ and $y = f(x)$ at
certain intervals of $x$. Suppose a value of $y$ is needed for an $x$, which is not

1 See references at end of chapter. 

2 Retention of an unnecessary number of significant figures should be carefully
avoided, especially in physical and chemical calculations. If $(n + 1)$ digits are carried
along in the intermediate stages of the calculations, the final result, obtained by
rounding off and thus containing $n$ significant figures, will be uncertain by one or two
units in its last digit. This practice is customary in treating scientific data.



listed in the table. Usually, the simplest procedure is to plot y against x, 
draw a smooth curve through the points and read from the graph the 
required value of y. The same result may be obtained by the use of 
interpolation formulas. Provided the given values of x are equidistant, 
we first form a difference table 3 as shown in Table 1 where the first, second, 
third and r-th differences are given by 

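$$\Delta y_0 = y_1 - y_0, \quad \Delta y_1 = y_2 - y_1, \quad \cdots, \quad \Delta y_{n-1} = y_n - y_{n-1},$$

$$\Delta y_n = y_{n+1} - y_n$$

$$\Delta^2 y_0 = \Delta y_1 - \Delta y_0 = y_2 - 2y_1 + y_0, \quad \cdots$$

$$\Delta^2 y_n = \Delta y_{n+1} - \Delta y_n = y_{n+2} - 2y_{n+1} + y_n$$

$$\Delta^3 y_n = \Delta^2 y_{n+1} - \Delta^2 y_n = y_{n+3} - 3y_{n+2} + 3y_{n+1} - y_n$$

$$\Delta^r y_n = \Delta^{r-1} y_{n+1} - \Delta^{r-1} y_n
  = y_{n+r} - r\,y_{n+r-1} + \frac{r(r-1)}{2!}\,y_{n+r-2} + \cdots + (-1)^r y_n
  = \sum_{k=0}^{r} (-1)^k \binom{r}{k} y_{n+r-k} \qquad (13\text{-}1)$$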


TABLE 1

x     y     Δ      Δ²      Δ³      Δ⁴      Δ⁵      Δ⁶
x₀    y₀
x₁    y₁    Δy₀
x₂    y₂    Δy₁    Δ²y₀
x₃    y₃    Δy₂    Δ²y₁    Δ³y₀
x₄    y₄    Δy₃    Δ²y₂    Δ³y₁    Δ⁴y₀
x₅    y₅    Δy₄    Δ²y₃    Δ³y₂    Δ⁴y₁    Δ⁵y₀
x₆    y₆    Δy₅    Δ²y₄    Δ³y₃    Δ⁴y₂    Δ⁵y₁    Δ⁶y₀



In forming such a table of differences, care must be taken to maintain the 
correct signs; the subtractions must all be performed in the order given in 
(1). A convenient check may be obtained by noting that the sum of the 
entries in any column equals the difference between the first and last 
entries in the preceding column. It also happens in most cases that the 
differences of some order will be zero or will vary (perhaps with alternating 
signs) only in the last few figures of the numbers retained. This is the 
basis for all of the methods described in the first part of this chapter, for if 
the unknown $f(x)$ were a polynomial of the $n$-th degree, the $n$-th differences
would be constant and the $(n + 1)$-th differences zero.
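Forming the difference table is purely mechanical and is easily delegated
to a short routine. The following Python sketch builds all columns of
differences for the tabulated values of $e^{-x^2}$ used in Table 2 and applies
the column-sum check described above; the function name and the variable
names are assumptions made for this illustration.

```python
def difference_table(ys):
    """Return [ys, first differences, second differences, ...] as lists."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table

# Tabulated values of y = exp(-x**2) at x = 0, 0.05, ..., 0.30 (Table 2)
ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]
table = difference_table(ys)

# Differences expressed in units of the fifth decimal place
first = [round(d * 1e5) for d in table[1]]
second = [round(d * 1e5) for d in table[2]]

# Check: the sum of a column equals (last - first) entry of the preceding one
check = abs(sum(table[2]) - (table[1][-1] - table[1][0])) < 1e-12
```

Here `table[r][n]` holds $\Delta^r y_n$, so that the entries used by a formula
starting from $y_k$ are found at index `k` of each column.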

3 Many different notations and forms of the difference table will be found in books
on numerical methods but it will usually be simple to find the relations between the 
various symbols used. 



Now if $x_k$ and $y_k$ are values given in such a table, $h$ is the common
interval of $x$, $h = x_1 - x_0 = x_2 - x_1 = \cdots = x_n - x_{n-1}$, and

$$x = x_k + hu; \qquad u = \frac{x - x_k}{h} \qquad (13\text{-}2)$$

then a value of $y$ for an $x$ not contained in the table is given by Newton's
interpolation formula,

$$y = y_k + u\,\Delta y_k + \frac{u(u-1)}{2!}\,\Delta^2 y_k + \frac{u(u-1)(u-2)}{3!}\,\Delta^3 y_k + \cdots
  + \frac{u(u-1)(u-2) \cdots (u-r+1)}{r!}\,\Delta^r y_k \qquad (13\text{-}3)$$

A second useful form of this equation may also be obtained:

$$y = y_k + u\,\Delta y_{k-1} + \frac{u(u+1)}{2!}\,\Delta^2 y_{k-2} + \frac{u(u+1)(u+2)}{3!}\,\Delta^3 y_{k-3} + \cdots
  + \frac{u(u+1)(u+2) \cdots (u+r-1)}{r!}\,\Delta^r y_{k-r} \qquad (13\text{-}4)$$

It will be noticed that (3) involves differences lying on a diagonal line in the
table, starting from $y_k$, while (4) uses differences on a horizontal line from
$y_k$. Thus (3) should be used for interpolation near the beginning of a
difference table and (4) for interpolation near the end. Summation should 
be continued until the desired number of significant figures is obtained. 
These two formulas may also be used to extrapolate at both ends of the 
difference table but due caution should be used in such cases unless it is 
known that the function is continuous beyond the tabulated values. 

Example 1. Interpolate in Table 2, to find $y = e^{-x^2}$ for $x = 0.0477$.
We take $x_k = 0.05$, thus $h = 0.05$, $u = -0.046$. Using (3),

$$y = 0.99750 + 4.6 \times 7.45 \times 10^{-5} - \frac{4.6 \times 1.046 \times 4.85}{2} \times 10^{-5}
    - \frac{4.6 \times 1.046 \times 2.046 \times 1.9}{6} \times 10^{-6}$$

$$= 0.99750 + 0.00034 - 0.00012 = 0.99772$$

It will be noticed that the third and fourth differences are too small for
consideration. The result is correct to the last figure given as may be
found by expanding $e^{-x^2}$ in a power series. In this case, the calculations
may easily be performed with a slide-rule.





TABLE 2 4

x       y = e^{-x^2}      Δ        Δ²      Δ³     Δ⁴
0       1.00000
0.05    0.99750         - 250
0.10    0.99005         - 745    -495
0.15    0.97775         -1230    -485    +10
0.20    0.96079         -1696    -466    +19    +9
0.25    0.93941         -2138    -442    +24    +5
0.30    0.91393         -2548    -410    +32    +8



Example 2. Calculate $y = e^{-x^2}$ for $x = 0.2862$. Since this value is
near the end of the table, it is better to use (4) with $x_k = 0.30$, $u = -0.276$.
Then,

$$y = 0.91393 + 2.548 \times 2.76 \times 10^{-3} + \frac{4.1 \times 2.76 \times 7.24}{2} \times 10^{-5}
    - \frac{3.2 \times 2.76 \times 7.24 \times 1.724}{6} \times 10^{-6}$$

$$= 0.91393 + 0.00703 + 0.00041 - 0.00002 = 0.92135$$


This result is also correct to the last significant figure. 
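Both forms of Newton's formula can be coded in a few lines, which makes it
easy to repeat Examples 1 and 2 or to try other arguments. The following is
a sketch in Python under the assumption that the differences are stored
column by column, with $\Delta^r y_n$ at position `n` of column `r`; the function
names and the argument `order` (the highest difference retained) are
illustrative choices, not part of the text.

```python
from math import exp

def difference_table(ys):
    """Return [ys, first differences, second differences, ...] as lists."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table

def newton_forward(xs, ys, k, x, order):
    """Eq. (3): uses the differences Delta^r y_k, r = 1 .. order."""
    h = xs[1] - xs[0]
    u = (x - xs[k]) / h
    diffs = difference_table(ys)
    total, coeff = ys[k], 1.0
    for r in range(1, order + 1):
        coeff *= (u - (r - 1)) / r          # u(u-1)...(u-r+1)/r!
        total += coeff * diffs[r][k]
    return total

def newton_backward(xs, ys, k, x, order):
    """Eq. (4): uses the differences Delta^r y_{k-r}, r = 1 .. order."""
    h = xs[1] - xs[0]
    u = (x - xs[k]) / h
    diffs = difference_table(ys)
    total, coeff = ys[k], 1.0
    for r in range(1, order + 1):
        coeff *= (u + (r - 1)) / r          # u(u+1)...(u+r-1)/r!
        total += coeff * diffs[r][k - r]
    return total

xs = [0.05 * i for i in range(7)]
ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]

y1 = newton_forward(xs, ys, 1, 0.0477, 3)     # setting of Example 1
y2 = newton_backward(xs, ys, 6, 0.2862, 3)    # setting of Example 2
exact1, exact2 = exp(-0.0477**2), exp(-0.2862**2)
```

The caller must make sure that the requested differences exist, i.e. that
`k + order` does not run past the end of the table in the forward form and
that `k - order` is not negative in the backward form.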

An arrangement of tabulated data, somewhat different from that of 
Table 1, leads to central difference formulas, notably those of Stirling and 
Bessel. While these converge faster than Newton’s formula, this advan- 
tage in most cases is of no practical importance. 5 

Problem. Interpolate or extrapolate from the data of Table 2 to find $y = e^{-x^2}$
for $x$ = 0.045; 0.2775; 0.3018.

13.3. Interpolation for Unequal Values of the Argument. — When the
values of $x$ are given for unequal intervals, (3) and (4) do not apply, but it
is possible to use divided differences or the interpolation formula of Lagrange.
Both methods are tedious to apply and not very precise, hence it is usually
better to interpolate from a suitable graph. We give Lagrange's formula
only; for the method of divided differences, Whittaker and Robinson
(loc. cit.) may be consulted. Suppose $x_0, x_1, \cdots, x_n$ and $y_0, y_1, \cdots, y_n$

4 Following the usual custom, we omit zeros after the decimal point in the various 
differences. 

5 For details concerning central differences, see references cited at end of chapter. 





are known, then for some other value of $x$,

$$y = f(x) = \frac{(x - x_1)(x - x_2) \cdots (x - x_n)}{(x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_n)}\,y_0
  + \frac{(x - x_0)(x - x_2) \cdots (x - x_n)}{(x_1 - x_0)(x_1 - x_2) \cdots (x_1 - x_n)}\,y_1 + \cdots
  + \frac{(x - x_0)(x - x_1) \cdots (x - x_{n-1})}{(x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1})}\,y_n \qquad (13\text{-}5)$$


Example 3. The following data were obtained in the calibration of a
platinum-rhodium thermocouple. Find the temperature corresponding to
a reading of 9.000 millivolts.

t, °C.            630.5    960.5    1063.0
e, millivolts     5.535    9.117    10.301

With $x = 9.000$, $x_0 = 5.535$, $x_1 = 9.117$, $x_2 = 10.301$, $y_0 = 630.5$, $y_1 = 960.5$,
$y_2 = 1063.0$,

$$y = \frac{(-0.117)(-1.301)(630.5)}{(-3.582)(-4.766)} + \frac{(3.465)(-1.301)(960.5)}{(3.582)(-1.184)}
    + \frac{(3.465)(-0.117)(1063.0)}{(4.766)(1.184)} = 950.4°\text{C}.$$


The value obtained from a carefully constructed curve is 950.2°C. 
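Lagrange's formula (5) translates directly into a double loop. The sketch
below, in Python, applies it to the thermocouple data of Example 3; the
function name `lagrange` and the variable names are assumptions made for
illustration. Evaluated in floating point, the three terms add up to
roughly 950.2, which lies closer to the graphical value quoted above than
to the hand-computed figure.

```python
def lagrange(xs, ys, x):
    """Evaluate Lagrange's interpolation formula, eq. (5), at the argument x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

emf = [5.535, 9.117, 10.301]      # millivolts
temp = [630.5, 960.5, 1063.0]     # degrees C

t_at_9mV = lagrange(emf, temp, 9.000)
```

Because the formula is symmetric in the roles of the two columns, calling
`lagrange(temp, emf, t)` gives the inverse interpolation, i.e. the reading
expected at a temperature `t`.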

13.4. Inverse Interpolation. — The problem of inverse interpolation, as 
the name implies, is that of finding a value of x corresponding to a given 
value of y = f(x). From Lagrange's formula, it is seen that the roles of x 
and $y$ may be interchanged so that (5) may be used for inverse interpolation
by rewriting it to give $x = \phi(y)$. An illustration of this application
of (5) is shown in the following problem. Inverse interpolation may also 
be effected by reversion of the series (3) or (4) to find u as a function of y 
and Ay. The unknown x is then obtained from (2) or by a method of 
successive approximations. Full details of both procedures are given by 
Scarborough (loc. cit.). 


Problem. From the data of Example 3, sec. 13.3, find the electromotive force of
the thermocouple when the temperature is 750°C.


13.5. Two-way Interpolation. — Suppose the tabulated quantity is
given as a function of two independent variables, for example, the index
of refraction of water as it varies with both temperature and wavelength.
Interpolation to give a value of $y$ for two variables not contained in such
tables is best performed by using Newton's formula to interpolate for each
variable separately. Series, similar to Newton's for direct two-way
interpolation, are given by Scarborough (loc. cit.).


NUMERICAL DIFFERENTIATION 


13.6. Differentiation Using Interpolation Formula. — In order to
determine the numerical derivative of a function of $x$ at a given point, the slope
of the curve of the function may be obtained by graphical means or the
data may be fitted to an empirical equation which is then differentiated.
We may also write

$$\frac{dy}{dx} = \left(\frac{dy}{du}\right)\left(\frac{du}{dx}\right) \qquad (13\text{-}6)$$

and if we use (2) and (3) we get

$$\frac{dy}{dx} = \frac{1}{h}\frac{dy}{du}
  = \frac{1}{h}\left[\Delta y_k + \frac{(2u - 1)}{2!}\,\Delta^2 y_k + \frac{(3u^2 - 6u + 2)}{3!}\,\Delta^3 y_k + \cdots\right] \qquad (13\text{-}7)$$

At the point $x = x_k$, $u = 0$, so we have

$$\left(\frac{dy}{dx}\right)_{x_k} = \frac{1}{h}\left[\Delta y_k - \frac{1}{2}\Delta^2 y_k + \frac{1}{3}\Delta^3 y_k - \frac{1}{4}\Delta^4 y_k + \cdots\right]$$

$$\left(\frac{d^2y}{dx^2}\right)_{x_k} = \frac{1}{h^2}\left[\Delta^2 y_k - \Delta^3 y_k + \frac{11}{12}\Delta^4 y_k - \cdots\right] \qquad (13\text{-}8)$$

More terms and higher order derivatives may be readily found. Since the
lower order differences disappear upon differentiation, the convergence of
(8) is slower than that of (3) or (4), therefore derivatives obtained in this
way are not very precise.

Maxima or minima in a tabulated function may be found by substituting
the differences in (7), equating the derivative to zero and solving for $u$
and then for $x$ from the relation $x = x_k + hu$.

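Example 4. Find $dy/dx$ and $d^2y/dx^2$ for $y = e^{-x^2}$ at the point $x = 0.05$
from the data of Table 2.

$$\left(\frac{dy}{dx}\right)_{x=0.05} = \frac{1}{0.05}\left[-0.00745 + \frac{0.00485}{2} + \frac{0.00019}{3} - \frac{0.00005}{4}\right] = -0.09980$$

$$\left(\frac{d^2y}{dx^2}\right)_{x=0.05} = \frac{1}{(0.05)^2}\left[-0.00485 - 0.00019 + \frac{0.00055}{12}\right] = -2.00000$$

The values found by differentiation are

$$dy/dx = -2xy = -0.099750$$
$$d^2y/dx^2 = 2y(2x^2 - 1) = -1.985025$$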




13.7. Differentiation Using a Polynomial. — Another method of finding
the derivative has been described by Rutledge. 6 It does not depend on
differences but assumes that the given data can be fitted to a polynomial of
the fourth or lower degree. Five points must be known, that is, five values
of $x$ and $y$. If $h$ is the equal interval between successive values of $x$, the
derivative of $y = f(x)$ at the point $x = x_k$ is given by the three following
approximately equivalent expressions.

$$\left(\frac{dy}{dx}\right)_{x_k} = \frac{1}{12h}\left[3y_{k+1} + 10y_k - 18y_{k-1} + 6y_{k-2} - y_{k-3}\right]$$

$$= \frac{1}{12h}\left[(y_{k-2} - y_{k+2}) - 8(y_{k-1} - y_{k+1})\right]$$

$$= \frac{1}{12h}\left[y_{k+3} - 6y_{k+2} + 18y_{k+1} - 10y_k - 3y_{k-1}\right] \qquad (13\text{-}9)$$


These equations are particularly suitable for solution by one continuous 
operation with a calculating machine. The method may be extended to 
apply to polynomials of degree higher than four or to derivatives of higher 
order. 

Example 5. Find $dy/dx$ at $x = 0.15$ for $y = e^{-x^2}$ using the data of
Table 2 and the method of this section.

$$\left(\frac{dy}{dx}\right)_{x=0.15} = \frac{1}{12 \times 0.05}\left[3 \times 0.96079 + 9.7775 - 18 \times 0.99005 + 6 \times 0.99750 - 1\right] = -0.2934$$

$$= \frac{1}{0.6}\left[(0.99750 - 0.93941) - 8(0.99005 - 0.96079)\right] = -0.2933$$

$$= \frac{1}{0.6}\left[0.91393 - 6 \times 0.93941 + 18 \times 0.96079 - 9.7775 - 3 \times 0.99005\right] = -0.2933$$

By direct differentiation, $dy/dx = -2xy = -0.293325$.
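The three five-point expressions of (9) lend themselves to a single short
routine. The sketch below, in Python, evaluates all three at $x = 0.15$ for
the data of Table 2 and compares them with $-2xy$; the function name and
the labels `lagging`, `central` and `leading` are hypothetical names chosen
here to indicate which side of $x_k$ most of the points lie on.

```python
def five_point_derivatives(ys, k, h):
    """Three five-point estimates of dy/dx at x_k, following eq. (9).

    All three need the entries from ys[k-3] to ys[k+3] to exist.
    """
    y = ys
    lagging = (3*y[k+1] + 10*y[k] - 18*y[k-1] + 6*y[k-2] - y[k-3]) / (12*h)
    central = ((y[k-2] - y[k+2]) - 8*(y[k-1] - y[k+1])) / (12*h)
    leading = (y[k+3] - 6*y[k+2] + 18*y[k+1] - 10*y[k] - 3*y[k-1]) / (12*h)
    return lagging, central, leading

ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]
estimates = five_point_derivatives(ys, 3, 0.05)    # node x = 0.15
exact = -2 * 0.15 * 0.97775                        # dy/dx = -2xy
```

Near the ends of a table only one of the three expressions can be used,
which is the practical reason for having the unsymmetrical forms.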

Problem. Use the data of Table 5 to find $dy/dx$ and $d^2y/dx^2$ at $x = 0.75$ by the
methods of secs. 13.6 and 13.7.


NUMERICAL INTEGRATION 

13.8. Introduction. — Suppose $f(x)$ is known to be continuous over an
interval of x from a to b but that either the explicit form of f(x) is unknown 
or it is such a function that its definite integral cannot be determined 


6 Rutledge, G., Phys. Rev . 40, 262 (1932). 



13.9 


NUMERICAL CALCULATIONS 


474 


conveniently in terms of other known functions. Numerical evaluation of 
such integrals, a process called approximate quadratures, depends on replac- 

J -%b pb 

f(x)dx oy another integral I </>(x)dx where <t>(x) can 

a J a 

be determined in a simple way. If f(x) is known to have the (n + 1) 
values y 0 , yi, • ■ -, y n at (n + 1) points within the interval (a,b), the latter 
integral may be expressed as 

pb 

I 4 >(x)dx — Aoyo + ^4-iZ/i + • • • + A n y n (13-10) 


where the (n + 1) quantities A m are independent of the (n + 1) values 
of the y m . It follows that if f(x ) is a polynomial of degree < n, the error 

made in replacing f f(x)dx by J^A m y m may be made to vanish by the 

a 

proper choice of the A m . If /(z) is a polynomial of degree > n, the differ- 
ence between the true value of the integral and (10) may still be small 
enough to make this procedure useful. We first consider the methods 
where the y m are known at equal intervals. 

13.9. The Euler-Maclaurin Formula. — If the explicit form of f(x) is 
known and it has finite derivatives at the upper and lower limits of the 
integral or if these derivatives may be determined by numerical methods, 
the Euler-Maclaurin formula may be used to evaluate the integral. Indicating
the values of f(x) at x = a and at x = b by $y_0$ and $y_n$ and the intermediate
values by $y_1, y_2, y_3, \cdots$, this formula is written

$$\int_a^b f(x)\,dx = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + \frac{y_n}{2}\right] - \sum_{r=1}^{\infty} \frac{B_{2r}h^{2r}}{(2r)!}\left(y_n^{(2r-1)} - y_0^{(2r-1)}\right) \tag{13-11}$$

where $y_n^{(r)}$ and $y_0^{(r)}$ are the r-th derivatives of f(x) at the points b and a.
The numerical coefficients $B_r$ are the Bernoulli numbers, defined by the
relation

$$\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} \frac{B_nx^n}{n!} \tag{13-12}$$

which may be rearranged to give the identity

$$\sum_{n=0}^{\infty} \frac{x^n}{(n+1)!} \sum_{n=0}^{\infty} \frac{B_nx^n}{n!} = 1$$

Successive values of $B_r$ are obtained from this equation by equating the
coefficients of equal powers of x to zero or more simply as follows. Expand
the equation

$$(B + 1)^n - B^n = 0 \tag{13-13}$$




TABLE 3

(Differences are given in units of the sixth decimal place.)

  x      1/x          Δ         Δ²        Δ³        Δ⁴
 1.0   1.000000
 1.2   0.833333    -166667
 1.4   0.714286    -119047    +47620
 1.6   0.625000     -89286    +29761    -17859
 1.8   0.555556     -69444    +19842     -9919    +7940
 2.0   0.500000     -55556    +13888     -5954    +3965


Example 6. Divide the interval between 1.0 and 2.0 into five equal
parts and evaluate $\int_{1.0}^{2.0} (dx/x)$ by the Euler-Maclaurin
formula. The required values of f(x) = 1/x are given in Table 3. We
also need the derivatives of odd order which are

$$f(x) = 1/x; \qquad f^{(n)}(x) = -\frac{n!}{x^{n+1}}$$

Then $f'(1) = -1$; $f'''(1) = -6$; $f^{v}(1) = -120$; $f'(2) = -0.25$;
$f'''(2) = -0.375$; $f^{v}(2) = -1.875$. Since h = 0.2, we also find $h^2/12 = 0.003333$;
$h^4/720 = 2.222 \times 10^{-6}$; $h^6/30240 = 2.1 \times 10^{-9}$; and, for the differences of
the derivatives at the two limits, $\Delta' = 0.75$; $\Delta''' = 5.625$; $\Delta^{v} = 118.1$;
$I_T = 0.695635$. Hence,

$$I_{EM} = 0.695635 - 0.002500 + 0.000012 - (2.5 \times 10^{-7}) = 0.693147$$


7 Those with even subscript only are required in the Euler-Maclaurin formula: 




The fifth derivatives contribute to the final result only after six significant
figures. A more exact value of the integral is ln 2 = 0.69314718.
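
The arithmetic of Example 6 can be reproduced with the following Python sketch. It assumes the Euler-Maclaurin series is cut off after the fifth-derivative term, with the coefficients $h^2/12$, $h^4/720$ and $h^6/30240$ used in the example; the function name is illustrative only.

```python
import math

def euler_maclaurin_reciprocal(a, b, n):
    """Approximate the integral of 1/x from a to b using n equal intervals."""
    h = (b - a) / n
    ys = [1.0 / (a + i * h) for i in range(n + 1)]
    trapezoid = h * (0.5 * ys[0] + sum(ys[1:-1]) + 0.5 * ys[-1])

    def derivative(order, x):
        # r-th derivative of 1/x: (-1)**r * r! / x**(r + 1)
        return (-1) ** order * math.factorial(order) / x ** (order + 1)

    # (derivative order, B_2r/(2r)!) for r = 1, 2, 3
    terms = [(1, 1 / 12), (3, -1 / 720), (5, 1 / 30240)]
    correction = sum(
        c * h ** (order + 1) * (derivative(order, b) - derivative(order, a))
        for order, c in terms
    )
    return trapezoid - correction

integral = euler_maclaurin_reciprocal(1.0, 2.0, 5)
print(integral, math.log(2))
```

With only five intervals the corrected trapezoidal sum should already agree with ln 2 to better than one unit in the sixth decimal place.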

13.10. Gregory’s Formula. — In case the explicit form of f(x) is un- 
known, we may rewrite (11), using (8) in place of the derivatives and obtain 
Gregory's formula 

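$$
\begin{aligned}
\int_a^b f(x)\,dx = I_{Gr} = {} & h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + \frac{y_n}{2}\right] - \frac{h}{12}(\Delta y_{n-1} - \Delta y_0) \\
& - \frac{h}{24}(\Delta^2 y_{n-2} + \Delta^2 y_0) - \frac{19h}{720}(\Delta^3 y_{n-3} - \Delta^3 y_0) \\
& - \frac{3h}{160}(\Delta^4 y_{n-4} + \Delta^4 y_0)
\end{aligned}
\tag{13-15}
$$

It should be observed that the contents of the parentheses are alternately
differences and sums. Additional coefficients of $-h(\Delta^r y_{n-r} \pm \Delta^r y_0)$ may
be found by evaluating the definite integral

$$\frac{(-1)^r}{(r+1)!}\int_0^1 z(z - 1)(z - 2) \cdots (z - r)\,dz \tag{13-16}$$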

Example 7. Evaluate the integral of Example 6 by means of Gregory's
formula. We find h/12 = 0.01667; h/24 = 0.008333; 19h/720 = 0.0053;
3h/160 = 0.0038. Hence,

$$
\begin{aligned}
I_{Gr} = {} & 0.695635 - 0.01667(-0.055556 + 0.166667) \\
& - 0.008333(0.013888 + 0.047620) - 0.0053(-0.005954 + 0.017859) \\
& - 0.0038(0.003965 + 0.007940) = 0.693163
\end{aligned}
$$

The result is not as precise as that obtained in Example 6 because of the 
small number of available differences. 
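
A short Python sketch of Gregory's formula (13-15) follows, carried to fourth differences as in Example 7. The helper names `forward_differences` and `gregory` are illustrative; the ordinates are computed from 1/x rather than read from Table 3.

```python
import math

def forward_differences(ys, order):
    """table[r][m] holds the r-th forward difference of y at index m."""
    table = [list(ys)]
    for _ in range(order):
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table

def gregory(ys, h):
    """Gregory's formula with corrections through fourth differences."""
    n = len(ys) - 1
    d = forward_differences(ys, 4)
    total = h * (0.5 * ys[0] + sum(ys[1:-1]) + 0.5 * ys[-1])
    coefficients = [1 / 12, 1 / 24, 19 / 720, 3 / 160]
    for r, c in enumerate(coefficients, start=1):
        # odd orders enter as differences, even orders as sums
        sign = -1 if r % 2 == 1 else 1
        total -= c * h * (d[r][n - r] + sign * d[r][0])
    return total

ys = [1.0 / (1.0 + 0.2 * i) for i in range(6)]
result = gregory(ys, 0.2)
print(result, math.log(2))
```

The remaining discrepancy from ln 2 is about two units in the fifth decimal place, in line with the remark that only a few differences are available.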


Problem. Evaluate the integral of Example 8, sec. 13.11, by the Euler-Maclaurin 
and Gregory formulas. Divide the interval into five equal parts. 

13.11. The Newton-Cotes Formula.— Instead of using differences, it 
is possible to rewrite (11) or (15) in terms of the $y_m$ since the $\Delta^r y_m$ may be
reduced to sums of $y_m$ by means of (1). The resulting equation, called the
Newton-Cotes formula, is of the form of (10) where

$$A_m = \frac{(-1)^{n-m}h}{m!\,(n - m)!}\int_0^n \frac{z(z-1)(z-2)\cdots(z-n)}{(z - m)}\,dz \tag{13-17}$$

Table 4 gives the $A_m$ for several values of n. The values found in this
way may be easily checked since it is necessary that

$$A_0 + A_1 + A_2 + \cdots + A_n = nh \tag{13-18}$$





When this method is used, it is simpler to divide the interval from a to b 
into a number of sub-intervals. The number of y m ’s in each of these deter- 
mines the appropriate A m from Table 4. The value of the integral then 
equals the sum of the separate terms obtained by applying (10) to each sub- 
interval. We give a few special cases of the Newton-Cotes formula. 


TABLE 4

 n = 1    A_0 = A_1 = h/2
     2    A_0 = A_2 = h/3;  A_1 = 4h/3
     3    A_0 = A_3 = 3h/8;  A_1 = A_2 = 9h/8
     4    A_0 = A_4 = 14h/45;  A_1 = A_3 = 64h/45;  A_2 = 8h/15
     5    A_0 = A_5 = 95h/288;  A_1 = A_4 = 125h/96;  A_2 = A_3 = 125h/144
     6    A_0 = A_6 = 41h/140;  A_1 = A_5 = 54h/35;  A_2 = A_4 = 27h/140;  A_3 = 204h/105
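
The entries of Table 4 can be regenerated from (13-17) with exact rational arithmetic. The sketch below is one possible way to do so, not the authors' procedure: it takes h = 1, builds the numerator polynomial with the factor (z - m) omitted, and integrates it term by term; the function names are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest power first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def newton_cotes_weights(n):
    """Return A_0 ... A_n of (13-17) in units of h."""
    weights = []
    for m in range(n + 1):
        poly = [Fraction(1)]
        denom = Fraction(1)
        for j in range(n + 1):
            if j != m:
                poly = poly_mul(poly, [Fraction(-j), Fraction(1)])  # factor (z - j)
                denom *= (m - j)   # product equals (-1)**(n-m) * m! * (n-m)!
        integral = sum(c * Fraction(n) ** (k + 1) / (k + 1) for k, c in enumerate(poly))
        weights.append(integral / denom)
    return weights

for n in range(1, 7):
    print(n, [str(w) for w in newton_cotes_weights(n)])
```

Each row can be checked against (13-18): the weights for a given n must add up to n.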


a. The Trapezoidal Rule. If each sub-interval contains two values of
$y_m$, n = 1, $A_0 = A_1 = h/2$ and 8

$$I_T = \int_a^b f(x)\,dx = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + \frac{y_n}{2}\right] \tag{13-19}$$

This result is exact if the first differences of f(x) are constant. It will be
noticed that (19) forms the principal term in both the Euler-Maclaurin
and Gregory formulas.

b. Simpson’s Rule. If there are an even number of y m and we divide 
each sub-interval in two parts, we obtain, with n = 2 from Table 4, 
Simpson’s One-Third Rule : 

$$I_S = \int_a^b f(x)\,dx = \frac{h}{3}\left[y_0 + 4(y_1 + y_3 + \cdots + y_{n-1}) + 2(y_2 + y_4 + \cdots + y_{n-2}) + y_n\right] \tag{13-20}$$

This is exact if second differences of f(x) are constant. It is probably the 
most generally useful of all quadrature formulas. 

8 In order to avoid confusion, it should be noted that n has been taken with two 
meanings. In Table 4, it refers to the number of intervals between the lower and upper 
limits of the integral. It now refers to the number of divisions of the sub-intervals. 
As a subscript in (19), (20) and (21) it indicates the last available value of y m as in 
previous equations. 





c. Weddle’s Rule. Taking n = 6 from Table 4 and neglecting all
differences above the sixth, we obtain Weddle’s Rule:

$$
\begin{aligned}
I_W = \int_a^b f(x)\,dx = \frac{3h}{10}[ & y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 \\
& + 2y_6 + 5y_7 + y_8 + 6y_9 + y_{10} + 5y_{11} + 2y_{12} + \cdots \\
& + 2y_{n-6} + 5y_{n-5} + y_{n-4} + 6y_{n-3} + y_{n-2} + 5y_{n-1} + y_n]
\end{aligned}
\tag{13-21}
$$


This is the most accurate of the formulas 9 in this section but it has the dis- 
advantage that the interval must be divided into a number of parts equal 
to six or some multiple of it. 

Various other special cases of the Newton-Cotes equation may be
developed. The best known of these, generally called Simpson's
Three-Eighths Rule, is obtained from (10) and Table 4 with n = 3. As shown
by Scarborough (loc. cit.), it is inferior to the One-Third Rule and should
never be used.


Example 8. Evaluate $\int_0^{1.5} \frac{x^3}{e^x - 1}\,dx$ by the three preceding methods.
This integral is of importance in the Debye theory of the heat capacity of
solids; 10 it cannot be evaluated in terms of other known functions. Values
of the integral between 0 and n, with n from 0.01 to 24 in steps of 0.01 have
been given to six places by Beattie; 11 from his table, I = 0.615495.
Dividing the interval 0 to 1.50 into six equal parts, we obtain Table 5.
Since h = 0.25, we find

$$I_T = 0.25 \times (1.991643 + 0.484678) = 0.619080$$

$$I_S = \frac{0.25}{3} \times (4 \times 1.216979 + 2 \times 0.774664 + 0.969357) = 0.615550$$

$$I_W = \frac{3 \times 0.25}{10} \times (5 \times 0.839293 + 1.744021 + 6 \times 0.377686) = 0.615495$$

9 Note that the last term in (21) has the coefficient unity if n = 6 or some multiple
of 6. In deriving this formula, the coefficient of the term $\Delta^6 y_0$ is 41/140, which is taken
to be 3/10 in order to make the final form of the equation as simple as possible. The
resulting error is negligible.

10 See, for example, Taylor, H. S. and Glasstone, S., “ A Treatise on Physical Chem- 
istry”, Vol. 1, Third Edition, Chapter IV, D. Van Nostrand Co., Inc., New York, 1942. 

11 Beattie, J. Math. Phys. 6, 1 (1926). 





It is thus seen that the trapezoidal rule is the least accurate of these three 
equations while Weddle’s rule and Simpson’s rule give nearly the same 
results. 
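
The three composite rules (13-19), (13-20) and (13-21) are easy to code. The Python sketch below applies them to the ordinates of Table 5; the function names are illustrative, and the Weddle routine assumes the number of intervals is a multiple of six.

```python
def trapezoid(ys, h):
    return h * (0.5 * ys[0] + sum(ys[1:-1]) + 0.5 * ys[-1])

def simpson(ys, h):
    if (len(ys) - 1) % 2:
        raise ValueError("Simpson's rule needs an even number of intervals")
    return h / 3 * (ys[0] + 4 * sum(ys[1:-1:2]) + 2 * sum(ys[2:-1:2]) + ys[-1])

def weddle(ys, h):
    n = len(ys) - 1
    if n % 6:
        raise ValueError("Weddle's rule needs a multiple of six intervals")
    pattern = [1, 5, 1, 6, 1, 5]
    total = 0.0
    for i, y in enumerate(ys):
        weight = pattern[i % 6]
        if i % 6 == 0 and 0 < i < n:
            weight = 2            # ordinate shared by two adjacent panels
        total += weight * y
    return 0.3 * h * total

# ordinates of x**3 / (exp(x) - 1) at x = 0, 0.25, ..., 1.50 (Table 5)
table5 = [0.0, 0.055013, 0.192687, 0.377686, 0.581977, 0.784280, 0.969357]
i_t, i_s, i_w = trapezoid(table5, 0.25), simpson(table5, 0.25), weddle(table5, 0.25)
print(i_t, i_s, i_w)
```

Of the three results, the Weddle value lies closest to the tabulated 0.615495 quoted from Beattie.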


TABLE 5

   x       f(x) = x^3/(e^x - 1)
  0          0
  0.25       0.055013
  0.50       0.192687
  0.75       0.377686
  1.00       0.581977
  1.25       0.784280
  1.50       0.969357


Problem a. Compute some of the coefficients of Table 4. 

Problem b. Divide the interval of the integral of Table 5 into twelve equal parts 
and perform the integrations by the three methods of this section. 


13.12. Gauss' Method. — The method of Gauss not only determines the
(n + 1) values of $A_m$ but also fixes the (n + 1) $y_m$'s of (10) in such a way
that the difference between $\int_a^b \phi(x)\,dx$ and $\int_a^b f(x)\,dx$ is a minimum. Since
there are now (2n + 2) constants available, it follows that if f(x) is a
polynomial of degree ≤ (2n + 1), the method will give an exact result for the
integral. It will be remembered that the Newton-Cotes method will be
exact under similar conditions, if the degree of the polynomial ≤ n, hence
Gauss’ method will give a more nearly exact result than the Newton-Cotes
method with the same number of values of $y_m$, or conversely the
Newton-Cotes method requires a larger number of known values of the function
than Gauss’ method for the same allowed error. This is a matter of some
importance especially when the given values of $y_m$ are limited in number
as they are likely to be when they result from experimental measurements.

In applying the method, it is convenient to change the limits of the
integral $\int_a^b f(x)\,dx$ by making the substitution

$$x = a + (b - a)v \tag{13-22}$$

hence in terms of the new variable v, the limits are 0 and 1. Then,

$$f(x) = f[a + (b - a)v] = F(v); \qquad dx = (b - a)\,dv \tag{13-23}$$

and

$$I_G = \int_a^b f(x)\,dx = (b - a)\int_0^1 F(v)\,dv \tag{13-24}$$





Developing F(v) in a series similar to (10), we have

$$I_G = (b - a)(R_0F_0 + R_1F_1 + R_2F_2 + \cdots + R_nF_n) \tag{13-25}$$

where $F_m$ means the numerical value of $F(v_m)$. Now it can be shown 12
that the difference between $\int_a^b f(x)\,dx$ and $I_G$ of (25) is made a minimum
provided $v_m$ and $R_m$ are determined by the relations

$$\sum_{m=0}^{n} R_m = 1; \quad \sum_{m=0}^{n} R_mv_m = \frac{1}{2}; \quad \sum_{m=0}^{n} R_mv_m^2 = \frac{1}{3}; \quad \cdots \quad \sum_{m=0}^{n} R_mv_m^r = \frac{1}{r + 1} \tag{13-26}$$

Since there are (2n + 2) constants to be evaluated, the most direct
procedure would be to solve simultaneously (2n + 2) equations like (26).
This, however, is very laborious even for small values of n but the $v_m$ alone
may be found in the following way. Let $z_0, z_1, z_2, \cdots, z_n$ be the (n + 1)
real roots of the Legendre polynomial 13 $P_{n+1}$ of degree (n + 1) obtained
from the equation $P_{n+1}(z) = 0$. Then,

$$v_0 = \tfrac{1}{2}(1 + z_0); \quad v_1 = \tfrac{1}{2}(1 + z_1); \quad \cdots; \quad v_n = \tfrac{1}{2}(1 + z_n) \tag{13-27}$$

With the (n + 1) values of $v_m$ determined in this way, it is a simple matter
to find the remaining constants $R_m$, (n + 1) in number, for it is only
necessary to solve simultaneously (n + 1) relations like (26). Values of both
$v_m$ and $R_m$ are given in Table 6. 14

Some writers make the substitution

$$x = \frac{(a + b)}{2} + \frac{(b - a)}{2}w$$

which changes the limits of the integral to ±1. In this case,

$$I_G = \frac{(b - a)}{2}\int_{-1}^{1} g(w)\,dw = \frac{(b - a)}{2}\sum_{m=0}^{n} T_m\,g(w_m)$$

where g(w) corresponds to the former F(v) and

$$T_m = 2R_m; \qquad w_m = 2v_m - 1$$

12 The proof is given by Hobson, E. W., “The Theory of Spherical and Ellipsoidal
Harmonics,” Cambridge Press, 1931.

13 See sec. 3.3.

14 More extensive lists may be found in “Tables of Lagrangian Interpolation
Coefficients,” Columbia University Press, New York, 1944.





It will be observed that in Gauss’ method, the interval is not subdivided 
equally as in the preceding cases but it is divided symmetrically about the 
mid-point. 


TABLE 6

 n = 2    v_0 = 0.11270166      R_0 = R_2 = 5/18
          v_1 = 0.5             R_1 = 4/9
          v_2 = 0.88729833

     3    v_0 = 0.06943184      R_0 = R_3 = 0.17392742
          v_1 = 0.33000948      R_1 = R_2 = 0.32607258
          v_2 = 0.66999052
          v_3 = 0.93056816

     4    v_0 = 0.04691008      R_0 = R_4 = 0.11846344
          v_1 = 0.23076534      R_1 = R_3 = 0.23931434
          v_2 = 0.5             R_2 = 0.28444444
          v_3 = 0.76923466
          v_4 = 0.95308992

     5    v_0 = 0.03376524      R_0 = R_5 = 0.08566225
          v_1 = 0.16939531      R_1 = R_4 = 0.18038079
          v_2 = 0.38069041      R_2 = R_3 = 0.23395697
          v_3 = 0.61930959
          v_4 = 0.83060469
          v_5 = 0.96623476

Example 9. Apply Gauss’ method to the integral of Examples 6
and 7, subdividing the interval into four parts. From (22) and (24), we
find x = 1 + v and $I_G = \int_0^1 F(v)\,dv$. From Table 6, with n = 3, we obtain
$F_0 = 1/1.069432 = 0.935076$; $F_1 = 1/1.330009 = 0.751875$;
$F_2 = 1/1.669990 = 0.598806$; $F_3 = 1/1.930568 = 0.517982$. Then,

$$I_G = 0.173927 \times (0.935076 + 0.517982) + 0.326072 \times (0.751875 + 0.598806) = 0.693145$$


The result is as precise as that obtained by the Euler-Maclaurin or Gregory 
formulas but entails much less work. 
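
Example 9 can be repeated in a few lines of Python. The sketch below hard-codes the n = 3 abscissas and weights of Table 6 (taking $R_1 = R_2 = 0.32607258$, the value used in the example) and applies (13-22) through (13-25); the names `NODES`, `WEIGHTS` and `gauss` are illustrative.

```python
import math

# v_m and R_m for n = 3, referred to the interval (0, 1)
NODES = [0.06943184, 0.33000948, 0.66999052, 0.93056816]
WEIGHTS = [0.17392742, 0.32607258, 0.32607258, 0.17392742]

def gauss(f, a, b):
    """Four-point Gauss quadrature of f over (a, b) using x = a + (b - a) v."""
    return (b - a) * sum(r * f(a + (b - a) * v) for v, r in zip(NODES, WEIGHTS))

result = gauss(lambda x: 1.0 / x, 1.0, 2.0)
print(result, math.log(2))
```

Only four evaluations of the integrand are needed, against six ordinates plus derivatives or differences in Examples 6 and 7.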


Problem a. Find values of $v_m$ and $R_m$ for n = 2. Hint: $P_{n+1}(z) = \dfrac{d^3(z^2 - 1)^3}{dz^3} = 0$.


Problem b. Evaluate the integral of Example 6, sec. 13.9 by Gauss' method. 
Use the limits 1.0 and 3.0, subdividing this interval into four divisions. 


13.13. Remarks Concerning Quadrature Formulas. — The selection of
the most suitable quadrature formula to use in a specific case is a matter for 
which no general rules can be given. When the explicit form of f(x) is 





known and the differentiations easily made, the Euler-Maclaurin formula 
has the advantage of giving a result to any required number of figures. 
When the explicit form of f(x) is not known or if it cannot be differentiated 
easily, Gregory’s formula is useful. As previously stated, the Newton- 
Cotes formula and its special cases such as the trapezoidal rule, Simpson’s 
and Weddle’s rules are approximations to the Euler-Maclaurin and Gregory 
formulas; they have the advantage of requiring less labor to apply than 
the two former but result in a loss of accuracy. Gauss’ method is appar- 
ently not used as often as might be expected in chemical and physical 
calculations. Since calculating machines are commonly used in such work, 
the application of it is not laborious and the resulting precision should 
recommend it. 

The reader should remember that in approximate quadratures, the 
integrand is being replaced by a polynomial, the latter instead of the origi- 
nal function then being integrated. It thus follows that the reliability of 
the result is determined by the fidelity with which the approximating poly- 
nomial matches the given function. Since Gauss’ formula fits a poly- 
nomial of given degree with fewer known points than any of the other 
formulas, it should be preferred when the function is of such a form that it 
can be used. Even if the explicit form of f(x) is unknown, Gauss’ formula
may still be applied but it requires interpolation between the given
$y_m$ to find the proper F(v). When the $y_m$ are the results of experiment and
can be arranged at will, Gauss’ formula in fact prescribes their optimum
positions as those determined by the $v_m$.

One caution regarding quadrature formulas should be mentioned. If
the graph of f(x) is such that the area under one portion of the integral is
much larger than that under another portion, the integral should be
evaluated separately for each area. The value of h for the sub-interval
contributing the least amount to the final result may then be taken as a
larger quantity than the h-value for the remaining sub-intervals. If
nothing is known of the behavior of f(x), a graph should always be drawn.


NUMERICAL SOLUTION OF DIFFERENTIAL EQUATIONS 

13.14. Introduction. — One often encounters differential equations which 
cannot be solved by any of the methods described in Chapter 2, except 
that of solution in terms of a series, and this method may be difficult to 
apply in certain cases. Even when an analytical solution is available, it is 
sometimes not easy to find numerical values of corresponding pairs of the 
dependent and independent variables. For example, if the initial
conditions $x_0 = 0$, $y_0 = 1$ are given for dy/dx = (y - x)/(y + x), the
solution is $\tfrac{1}{2}\ln(x^2 + y^2) + \tan^{-1}(y/x) = \pi/2$ but the labor of finding values





of x for given values of y will be very great. In cases of this kind, it is 
possible to proceed by graphical 15 or numerical methods. The object of 
the latter is to obtain a table of x and y over the range of x required by 
the particular problem at hand. When a few such values are known, the
table may be extended rather easily, as will be shown. Special methods
are required for finding the first few values of y. We present four different
ways of starting the solution of a differential equation by numerical meth- 
ods, and then show how the solution may be continued by extrapolation. 

13.15. The Taylor Series Method. — Suppose a differential equation
of the first order is given:

$$\frac{dy}{dx} = f(x,y) \tag{13-28}$$

with initial values $x = x_0$, $y = y_0$. We may then write the Taylor series

$$y = y_0 + (x - x_0)y_0' + \frac{(x - x_0)^2}{2!}y_0'' + \frac{(x - x_0)^3}{3!}y_0''' + \cdots + \frac{(x - x_0)^n}{n!}y_0^{(n)} \tag{13-29}$$


If it is possible to find the various derivatives, the calculations may be 
extended to as many values of x as desired. 

Example 10. Start the solution of the differential equation

$$\frac{dy}{dx} = e^{-x} - y \tag{13-30}$$

with initial conditions, $x_0 = 0$, $y_0 = 0$. The exact solution of (30) is
found by the methods of Chapter 2 to be $y = xe^{-x}$; the reader will
recognize that it is of the form of the differential equation occurring in the study
of radioactive disintegration and in the kinetics of chemical reactions
involving consecutive first order decompositions. Since $y' = e^{-x} - y$,
$y'' = -e^{-x} - y'$, $\cdots$, $y^{(n)} = (-1)^{n-1}e^{-x} - y^{(n-1)}$, it follows that
$y_0^{(n)} = (-1)^{n-1}n$ and from (29),

$$y = x - x^2 + \frac{x^3}{2} - \frac{x^4}{6} + \frac{x^5}{24} - \frac{x^6}{120} + \cdots$$


15 For graphical methods, see Levy, H., and Baggott, E. A., “Numerical Studies
in Differential Equations,” Vol. 1, Watts and Co., London, 1934, or Sherwood and
Reed, “Applied Mathematics in Chemical Engineering,” McGraw-Hill Book Co.,
New York, 1939.





Taking x as 0.1, 0.2 and 0.3, we find the results which appear in Table 7.

TABLE 7

   x        y        Exact Values of y
  0        0           0
  0.1      0.0905      0.09048
  0.2      0.1637      0.16375
  0.3      0.2222      0.22224


While the method is very simple, it is often tedious to apply as the 
successive derivatives may become difficult to handle and even at x = 0.3 
in this case, the fifth derivative is needed. However, it would appear that 
this procedure is preferable to any other in finding the first few values of y 
when it is possible to use it. 
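
The Taylor series method of Example 10 can be sketched in Python as follows. The successive derivatives at the starting point are generated from the recurrence $y^{(n)} = (-1)^{n-1}e^{-x} - y^{(n-1)}$ given in the example; the function name and the default of six terms are choices made for this illustration.

```python
import math

def taylor_solution(x, x0=0.0, y0=0.0, terms=6):
    """Taylor series (13-29) for y' = exp(-x) - y about the point (x0, y0)."""
    derivs = [y0]
    for n in range(1, terms + 1):
        # y^(n) = (-1)^(n-1) exp(-x) - y^(n-1), evaluated at x0
        derivs.append((-1) ** (n - 1) * math.exp(-x0) - derivs[n - 1])
    return sum(d * (x - x0) ** n / math.factorial(n) for n, d in enumerate(derivs))

for x in (0.1, 0.2, 0.3):
    print(x, round(taylor_solution(x), 5), round(x * math.exp(-x), 5))
```

With terms through $x^6$ the series should reproduce the exact column of Table 7 to the five places shown.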

13.16. The Method of Picard (Successive Approximations or Itera- 
tion). — From (28), we see that a solution may be found in the form of an 
integral equation 


$$y = y_0 + \int_{x_0}^{x} f(x,y)\,dx \tag{13-31}$$

An approximate solution of this equation may be made by assuming that
$y = y_0$ under the integral sign. The integral may then be evaluated (by
quadratures, if necessary) since it is only a function of x and the constant
$y_0$. Denoting this first approximation to y by ${}^1y$,

$${}^1y = y_0 + \int_{x_0}^{x} f(x,y_0)\,dx \tag{13-32}$$

The process may be repeated to give

$${}^2y = y_0 + \int_{x_0}^{x} f(x,{}^1y)\,dx \tag{13-33}$$

and so on.

Example 11. Start the solution of Example 10, sec. 13.15 by this
method.

$${}^1y_1 = y_0 + \int_0^x (e^{-x} - y_0)\,dx = \int_0^x e^{-x}\,dx = 1 - e^{-x}$$

$${}^2y_1 = y_0 + \int_0^x (e^{-x} - {}^1y_1)\,dx = \int_0^x (2e^{-x} - 1)\,dx = 2(1 - e^{-x}) - x$$

With x = 0.1, ${}^3y_1 = 0.0906$; the next approximation, ${}^4y_1$, is the same as
${}^3y_1$, hence we proceed to calculate $y_2$ at x = 0.2 from the relations

$$
\begin{aligned}
{}^1y_2 &= {}^3y_1 + \int_{0.1}^{x} (e^{-x} - {}^3y_1)\,dx \\
&= {}^3y_1 + e^{-0.1} + 0.1\,{}^3y_1 - e^{-x} - 0.0906x \\
&= 1.0045 - e^{-x} - 0.0906x
\end{aligned}
$$

$${}^2y_2 = {}^3y_1 + \int_{0.1}^{x} (e^{-x} - {}^1y_2)\,dx$$

$${}^3y_2 = {}^3y_1 + \int_{0.1}^{x} (e^{-x} - {}^2y_2)\,dx = 0.1639$$

The next value, ${}^4y_2$, is the same as ${}^3y_2$ so we go on in the same way to find
${}^1y_3$, etc., at x = 0.3. The results by the Picard method are seen from
Table 7 to be not quite as good as those obtained in Example 10. Moreover
the disadvantages here are similar to those of the method of sec. 13.15, for
the successive integrals may become more and more difficult to determine.
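
A compact way to experiment with Picard's iteration for (30) is to carry every approximation as a truncated power series, so that the integration is done term by term. The Python sketch below makes that assumption (series cut off after $x^8$, an arbitrary choice) and always integrates from x = 0, whereas the example restarts the process at x = 0.1; the function names are illustrative.

```python
from fractions import Fraction
from math import exp, factorial

ORDER = 8  # highest power of x retained (assumption of this sketch)

def picard_step(y_coeffs):
    """One iteration y -> integral from 0 to x of (exp(-x) - y) dx, with y(0) = 0."""
    exp_neg = [Fraction((-1) ** k, factorial(k)) for k in range(ORDER + 1)]
    integrand = [e - y for e, y in zip(exp_neg, y_coeffs)]
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    return integrated[:ORDER + 1]

def evaluate(coeffs, x):
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))

y = [Fraction(0)] * (ORDER + 1)      # zeroth approximation: y = y0 = 0
iterates = []
for _ in range(4):
    y = picard_step(y)
    iterates.append(evaluate(y, 0.1))
print(iterates, 0.1 * exp(-0.1))
```

The first iterate equals $1 - e^{-x}$ and each further iterate corrects one more power of x, so at x = 0.1 the fourth iterate is already very close to $xe^{-x}$.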

13.17. The Modified Euler Method. — If the intervals between succes- 
sive values of x are small enough we may write Ax = h and 

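$$\Delta y = \left(\frac{dy}{dx}\right)_0 h \tag{13-34}$$

An approximate value of $y_1$ at $x_1 = x_0 + h$ is then given by

$${}^1y_1 = y_0 + \Delta y = y_0 + \left(\frac{dy}{dx}\right)_0 h \tag{13-35}$$

An approximation to dy/dx at $x_1$ may be obtained by the relation

$${}^1\!\left(\frac{dy}{dx}\right)_1 = f(x_1,{}^1y_1) \tag{13-36}$$

which leads to an improved value of $y_1$

$${}^2y_1 = y_0 + \frac{h}{2}\left[\left(\frac{dy}{dx}\right)_0 + {}^1\!\left(\frac{dy}{dx}\right)_1\right] \tag{13-37}$$

This in turn will give a better approximation to dy/dx

$${}^2\!\left(\frac{dy}{dx}\right)_1 = f(x_1,{}^2y_1) \tag{13-38}$$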

which may be used to compute the third approximation to y\. The process 
is repeated until there is no further change in the results. The values of y 
and dy/dx at x 2 are found in a similar way. 

Example 12. Start the solution of (30) by this method. Since
$x_0 = y_0 = 0$, $(dy/dx)_0 = 1$. With h = 0.1,

$${}^1y_1 = 1 \times 0.1 = 0.1$$
$${}^1(dy/dx)_1 = e^{-0.1} - 0.1 = 0.8048$$
$${}^2y_1 = 0.1(1.0 + 0.8048)/2 = 0.0902$$
$${}^2(dy/dx)_1 = 0.9048 - 0.0902 = 0.8146$$
$${}^3y_1 = 0.1(1.0 + 0.8146)/2 = 0.0907$$
$${}^3(dy/dx)_1 = 0.9048 - 0.0907 = 0.8141$$
$${}^4y_1 = 0.05(1.0 + 0.8141) = 0.0907$$

No further improvement results by continuing the approximations, so we
proceed to x = 0.2 with $y_1 = 0.0907$, $(dy/dx)_1 = 0.8141$. Then,

$${}^1y_2 = 0.0907 + 0.8141 \times 0.1 = 0.1721$$
$${}^1(dy/dx)_2 = e^{-0.2} - 0.1721 = 0.6466$$

and finally,

$${}^3y_2 = 0.1641, \qquad {}^3(dy/dx)_2 = 0.6546$$

This method is tedious in application but perhaps less complicated than 
either of the preceding methods since neither differentiation nor integra- 
tion is required. 
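
The cycle (13-35) to (13-38) translates directly into code. The Python sketch below repeats the correction until two successive values agree within a tolerance; the tolerance, the iteration cap and the function names are assumptions of this illustration.

```python
import math

def f(x, y):
    return math.exp(-x) - y            # right-hand side of (30)

def modified_euler_step(f, x0, y0, h, tol=1e-10, max_iter=50):
    slope0 = f(x0, y0)
    y1 = y0 + slope0 * h               # first approximation, (13-35)
    for _ in range(max_iter):
        y_new = y0 + 0.5 * h * (slope0 + f(x0 + h, y1))   # improved value, (13-37)
        if abs(y_new - y1) < tol:
            return y_new
        y1 = y_new
    return y1

y1 = modified_euler_step(f, 0.0, 0.0, 0.1)
y2 = modified_euler_step(f, 0.1, y1, 0.1)
print(round(y1, 4), round(y2, 4))
```

The two converged values round to 0.0907 and 0.1641, the figures carried into Table 8.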

13.18. The Runge-Kutta Method.— In this method it is necessary to 
calculate the four quantities 

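$$
\begin{aligned}
k_1 &= f(x_0, y_0)h \\
k_2 &= f\left(x_0 + \frac{h}{2},\ y_0 + \frac{k_1}{2}\right)h \\
k_3 &= f\left(x_0 + \frac{h}{2},\ y_0 + \frac{k_2}{2}\right)h \\
k_4 &= f(x_0 + h,\ y_0 + k_3)h
\end{aligned}
\tag{13-39}
$$

Then,

$$x_1 = x_0 + h, \qquad y_1 = y_0 + \Delta y, \qquad \Delta y = \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) \tag{13-40}$$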





It will be noted that if f(x,y) is independent of y, (40) reduces to Simpson's
rule. The same formulas are used to compute y at $x_2$, substituting $x_1$ and
$y_1$ for $x_0$ and $y_0$ in (39).

Example 13. With the same differential equation as before (Example 
10, sec. 13.15), we find 

$$
\begin{aligned}
k_1 &= (e^{-x_0} - y_0)h = 0.1 \\
k_2 &= (e^{-0.05} - 0.05)0.1 = 0.0901 \\
k_3 &= (e^{-0.05} - 0.0450)0.1 = 0.0906 \\
k_4 &= (e^{-0.1} - 0.0906)0.1 = 0.0814
\end{aligned}
$$

Hence,

$$y_1 = y_0 + \Delta y = \tfrac{1}{6}(0.1 + 0.1802 + 0.1812 + 0.0814) = 0.0905$$

For the next interval, we find in a similar way, $k_1 = 0.0814$, $k_2 = 0.0730$,
$k_3 = 0.0734$, $k_4 = 0.0655$, $\Delta y = 0.0733$, $y_2 = 0.09055 + 0.0733 = 0.1638$.

The error in the Runge-Kutta method is of the order of h 5 . It will be 
seen that its use is reasonably simple; it is probably the most generally 
useful of the four methods given here. 
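
Equations (13-39) and (13-40) amount to only a few lines of Python. The sketch below applies them to (30) for two steps of h = 0.1; the function names are illustrative.

```python
import math

def f(x, y):
    return math.exp(-x) - y            # right-hand side of (30)

def rk4_step(f, x0, y0, h):
    """One Runge-Kutta step, eqs. (13-39) and (13-40)."""
    k1 = f(x0, y0) * h
    k2 = f(x0 + h / 2, y0 + k1 / 2) * h
    k3 = f(x0 + h / 2, y0 + k2 / 2) * h
    k4 = f(x0 + h, y0 + k3) * h
    return x0 + h, y0 + (k1 + 2 * k2 + 2 * k3 + k4) / 6

x1, y1 = rk4_step(f, 0.0, 0.0, 0.1)
x2, y2 = rk4_step(f, x1, y1, 0.1)
print(y1, y2)
```

Carried with full machine precision instead of four decimals, both values should agree with $xe^{-x}$ to about six places.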

13.19. Continuing the Solution. — When the first few values (three or 
four) of y have been found by one of the preceding methods, the solution 
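may be continued by extrapolation. For this purpose, it is appropriate
to use Newton's interpolation formula (4), rewriting it in terms of
$y' = dy/dx$, $y'_k$ and the differences $\Delta y'_{k-1}, \Delta^2 y'_{k-2}, \cdots, \Delta^r y'_{k-r}$. Upon
substituting this expression in the equation

$$y = \int y'\,dx \tag{13-41}$$

and performing the integration, several useful formulas may be obtained
by changing the limits of the definite integral.

$$(\Delta y)_k^{k+1} = h\left(y'_k + \tfrac{1}{2}\Delta y'_{k-1} + \tfrac{5}{12}\Delta^2 y'_{k-2} + \tfrac{3}{8}\Delta^3 y'_{k-3} + \tfrac{251}{720}\Delta^4 y'_{k-4}\right) \tag{13-42}$$

$$(\Delta y)_{k-1}^{k} = h\left(y'_k - \tfrac{1}{2}\Delta y'_{k-1} - \tfrac{1}{12}\Delta^2 y'_{k-2} - \tfrac{1}{24}\Delta^3 y'_{k-3} - \tfrac{19}{720}\Delta^4 y'_{k-4}\right) \tag{13-43}$$

$$(\Delta y)_{k-2}^{k-1} = h\left(y'_k - \tfrac{3}{2}\Delta y'_{k-1} + \tfrac{5}{12}\Delta^2 y'_{k-2} + \tfrac{1}{24}\Delta^3 y'_{k-3} + \tfrac{11}{720}\Delta^4 y'_{k-4}\right) \tag{13-44}$$

$$(\Delta y)_{k-3}^{k-2} = h\left(y'_k - \tfrac{5}{2}\Delta y'_{k-1} + \tfrac{23}{12}\Delta^2 y'_{k-2} - \tfrac{3}{8}\Delta^3 y'_{k-3} - \tfrac{19}{720}\Delta^4 y'_{k-4}\right) \tag{13-45}$$

$$(\Delta y)_{k-4}^{k-3} = h\left(y'_k - \tfrac{7}{2}\Delta y'_{k-1} + \tfrac{53}{12}\Delta^2 y'_{k-2} - \tfrac{55}{24}\Delta^3 y'_{k-3} + \tfrac{251}{720}\Delta^4 y'_{k-4}\right) \tag{13-46}$$

The meaning of a symbol such as $(\Delta y)_k^{k+1}$ should be clear. It is the
increment to be added to the k-th value of y in the difference table to obtain the
next value beyond, that is, the value of y at $x_{k+1}$. Equation (42) is thus
to be used for extending the table to larger values of x while the remaining
formulas are useful in checking the values of y already found.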





Example 14. Extend the integration of the differential equation of
Examples 10, 11, 12 and 13, using the values of y' found in Example 12.
We first collect the data as shown in Table 8. To check y at x = 0.1,
let us use (44). Then since k = 2,

$$(\Delta y)_0^1 = 0.1\left(0.6546 + \tfrac{3}{2}\,0.1595 + \tfrac{5}{12}\,0.0264\right) = 0.0905$$

Thus, $y_1 = y_0 + (\Delta y)_0^1 = 0.0905$, which shows that the result in Table 8
is in error by 2 units in the last place. Similarly, to check $y_2$, we use (43)
to obtain

$$(\Delta y)_1^2 = 0.1(0.6546 + 0.0798 - 0.0022) = 0.0732$$

and

$$y_2 = y_1 + (\Delta y)_1^2 = 0.0905 + 0.0732 = 0.1637$$

We now make a new table (Table 9) to include our corrected values of
y, y', Δy', etc. To find $y_3$, we use (42) to obtain

$$(\Delta y)_2^3 = 0.1(0.6550 - 0.0796 + 0.0110) = 0.0586$$
$$y_3 = 0.1637 + 0.0586 = 0.2223$$

A check on $y_3$ may be found from (43)

$$(\Delta y)_2^3 = 0.1(0.5185 + 0.0682 - 0.0011 + 0.0005) = 0.0586$$

TABLE 8

   x       y         y'        Δy'       Δ²y'
  0       0         1.0000
  0.1     0.0907    0.8141    -0.1859
  0.2     0.1641    0.6546    -0.1595    +0.0264

TABLE 9

   x       y         Δy        y'        Δy'       Δ²y'      Δ³y'
  0.00    0                   1.0000
  0.10    0.0905    0.0905    0.8143    -0.1857
  0.20    0.1637    0.0732    0.6550    -0.1593    +0.0264
  0.30    0.2223    0.0586    0.5185    -0.1365    +0.0128    -0.0136


Since this is the same result as that found previously, we proceed to the 
next value of x. Moreover, since the preceding y was correct at the first 
trial, we suspect that the value of h might be increased, say to 0.20. We 
thus obtain y for x = 0.40 in the same manner as before, then rewrite the
table for x = 0, 0.20 and 0.40. From the new table, we go on to x = 0.60, 
etc. 
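
The extrapolation formula (13-42) and the checking formula (13-43) differ only in their coefficients, so both can share one routine. The Python sketch below forms the backward differences of y' at the end of the table and sums as many terms as there are differences available; the names `EXTRAPOLATE`, `CHECK` and `increment` are illustrative.

```python
import math

def backward_differences(values):
    """Return [v_k, Δv_(k-1), Δ²v_(k-2), ...] taken at the end of the list."""
    diffs, row = [values[-1]], list(values)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[-1])
    return diffs

EXTRAPOLATE = [1, 1 / 2, 5 / 12, 3 / 8, 251 / 720]     # coefficients of (13-42)
CHECK = [1, -1 / 2, -1 / 12, -1 / 24, -19 / 720]       # coefficients of (13-43)

def increment(derivs, h, coeffs):
    diffs = backward_differences(derivs)
    return h * sum(c * d for c, d in zip(coeffs, diffs))

h = 0.1
dys = [1.0, 0.8143, 0.6550]                 # y' at x = 0, 0.1, 0.2
predicted = increment(dys, h, EXTRAPOLATE)  # increment from x = 0.2 to 0.3
y3 = 0.1637 + predicted
dys.append(math.exp(-0.3) - y3)             # y' at x = 0.3 from (30)
checked = increment(dys, h, CHECK)          # same increment recomputed by (13-43)
print(round(predicted, 4), round(checked, 4))
```

The predicted and the checked increments should agree to within a unit or two in the fourth decimal place, which is the signal for proceeding to the next value of x.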
















13.20. Milne’s Method. — One further method of continuing the solu- 
tion of a differential equation is often useful. Supposing the first four 
values of y and y' have been found by some of the previous methods, we 
continue as follows. 

1. Find a first approximation to the next y by using the formula

$${}^1y_k = y_{k-4} + \frac{4h}{3}\left(2y'_{k-1} - y'_{k-2} + 2y'_{k-3}\right) \tag{13-47}$$

2. Substitute this in the original differential equation (28) to find $y'_k$.

3. Use the value of $y'_k$ to calculate ${}^2y_k$ from the relation

$${}^2y_k = y_{k-2} + \frac{h}{3}\left(y'_k + 4y'_{k-1} + y'_{k-2}\right) \tag{13-48}$$

If ${}^1y_k$ and ${}^2y_k$ agree to the desired number of figures, we may proceed to
the next interval in the same way. If they do not agree, the size of the
interval must be decreased. The error due to the use of (48) is

$$E = \left|\,{}^2y_k - {}^1y_k\,\right|$$

Eqs. (47) and (48) are obtained by integrating Newton’s interpolation
formula (3), after expressing it in terms of y'. Both formulas are exact
when fourth differences of y' vanish.

Example 15. Use Milne’s method to continue the solution of the
differential equation of the previous examples. For x = 0.4, we find using
Table 9 and (47),

$${}^1y_4 = \frac{0.4}{3}(2 \times 0.5185 - 0.6550 + 2 \times 0.8143) = 0.2681$$

From the original differential equation (30)

$$y'_4 = (0.6703 - 0.2681) = 0.4022$$

From (48),

$${}^2y_4 = 0.1637 + \frac{0.1}{3}(0.4022 + 4 \times 0.5185 + 0.6550) = 0.2681$$
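
Steps 1 to 3 of Milne's method can be sketched in Python as below, using the four rows of Table 9 as the starting values. The function `milne_step` and its argument layout are assumptions of this illustration.

```python
import math

def f(x, y):
    return math.exp(-x) - y            # right-hand side of (30)

def milne_step(f, xs, ys, dys, h):
    """Return the first (13-47) and second (13-48) approximations to the next y."""
    x_next = xs[-1] + h
    first = ys[-4] + 4 * h / 3 * (2 * dys[-1] - dys[-2] + 2 * dys[-3])
    dy_next = f(x_next, first)
    second = ys[-2] + h / 3 * (dy_next + 4 * dys[-1] + dys[-2])
    return first, second

xs = [0.0, 0.1, 0.2, 0.3]
ys = [0.0, 0.0905, 0.1637, 0.2223]     # y from Table 9
dys = [1.0, 0.8143, 0.6550, 0.5185]    # y' from Table 9
first, second = milne_step(f, xs, ys, dys, 0.1)
print(round(first, 4), round(second, 4))
```

Both approximations round to 0.2681, as in Example 15, so the interval h = 0.1 may be retained.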

Problem. Use the various methods of this chapter to obtain the solutions,
correct to four decimal places, of the differential equation dy/dx = (x - y) between 0
and 0.25, with $x_0 = 0$, $y_0 = 1$. The exact solution is $y = (x - 1) + 2e^{-x}$.

13.21. Simultaneous Differential Equations of the First Order. — Suppose
the given equations are

$$\frac{dy}{dx} = f_1(x,y,z); \qquad \frac{dz}{dx} = f_2(x,y,z) \tag{13-49}$$





where x is the independent variable and y, z are dependent variables.
Provided initial values of x, y and z are given, the first increments in y and z
due to an increment Δx in x may be found by any of the methods given in
the preceding sections. The procedure should be obvious, but it is
particularly necessary to check the results carefully at each stage of the solution.
If the Runge-Kutta method is used, the following equations replace (39)
and (40)

$$
\begin{aligned}
k_1 &= f_1(x_0, y_0, z_0)h \\
k_2 &= f_1\left(x_0 + \frac{h}{2},\ y_0 + \frac{k_1}{2},\ z_0 + \frac{m_1}{2}\right)h \\
k_3 &= f_1\left(x_0 + \frac{h}{2},\ y_0 + \frac{k_2}{2},\ z_0 + \frac{m_2}{2}\right)h \\
k_4 &= f_1(x_0 + h,\ y_0 + k_3,\ z_0 + m_3)h \\
m_1 &= f_2(x_0, y_0, z_0)h \\
m_2 &= f_2\left(x_0 + \frac{h}{2},\ y_0 + \frac{k_1}{2},\ z_0 + \frac{m_1}{2}\right)h \\
m_3 &= f_2\left(x_0 + \frac{h}{2},\ y_0 + \frac{k_2}{2},\ z_0 + \frac{m_2}{2}\right)h \\
m_4 &= f_2(x_0 + h,\ y_0 + k_3,\ z_0 + m_3)h
\end{aligned}
\tag{13-50}
$$

$$
\begin{aligned}
x_1 &= x_0 + h; \qquad y_1 = y_0 + \Delta y; \qquad z_1 = z_0 + \Delta z \\
\Delta y &= \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) \\
\Delta z &= \tfrac{1}{6}(m_1 + 2m_2 + 2m_3 + m_4)
\end{aligned}
\tag{13-51}
$$

13.22. Differential Equations of Second or Higher Order. — Any differ- 
ential equation of second or higher order is reduced to a system of simul- 
taneous equations by the introduction of new variables. Consider the 
equation 




$$y^{(n)} = f(x, y, y', y'', \cdots, y^{(n-1)}) \tag{13-52}$$

where y' = dy/dx, $y'' = d^2y/dx^2$, etc. Make the substitutions

$$z_1 = \frac{dy}{dx}; \quad z_2 = \frac{dz_1}{dx}; \quad \cdots; \quad z_{n-1} = \frac{dz_{n-2}}{dx} \tag{13-53}$$

then,

$$\frac{dz_{n-1}}{dx} = f(x, y, z_1, z_2, \cdots, z_{n-1}) \tag{13-54}$$

Provided initial values of $x, y, z_1, z_2, \cdots, z_{n-1}$ are given, the problem is
equivalent to the solution of a system of simultaneous first order differential
equations which may be effected as described in sec. 13.21.




In physical problems, differential equations of the type

$$\frac{d^2y}{dx^2} + f(x,y) = 0$$

or

$$\frac{d^2y}{dx^2} + f\left(x, y, \frac{dy}{dx}\right) = 0$$

often arise, with the requirement that the variables satisfy certain
boundary conditions, say $x = x_0$, $y = y_0$ and $x = x_1$, $y = y_1$ with the initial
value of dy/dx unknown. For example, in the Thomas-Fermi theory 16 of
the atom, the equation $d^2y/dx^2 = (y^3/x)^{1/2}$ occurs with the boundary
conditions, x = 0, y = 1, x = ∞, y = 0. In cases of this kind, a tentative
value of dy/dx is assumed and a rough integration is made over the range
of x. This first approximation will usually suggest a better guess for
the initial value of dy/dx. After several attempts are made, the value of
dy/dx may usually be found to the desired accuracy.

Example 16. Find y and dy/dx for the equation

$$\frac{d^2y}{dx^2} + 4x\frac{dy}{dx} - 4y = 0$$

Let dy/dx = z, then the second order equation is equivalent to the first
order equations

$$\frac{dy}{dx} = z; \qquad \frac{dz}{dx} + 4xz - 4y = 0$$

which may be solved by the previous methods. If the Runge-Kutta
method is used, $f_1(x,y,z) = z$ and $f_2(x,y,z) = -4xz + 4y$. In this case,
$f_1$ does not depend on x and y, a situation which makes the evaluation of
the k's in (50) somewhat simpler than in the general case. The differential
equation of this problem may be solved exactly by the substitution
$y = ve^{-x^2}$.
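
Equations (13-50) and (13-51) can be tried on Example 16 with the Python sketch below. The example states no initial values, so the sketch assumes y = 0 and dy/dx = 1 at x = 0; for that choice y = x satisfies the differential equation, which gives a simple check. The function names are illustrative.

```python
def rk4_system_step(f1, f2, x0, y0, z0, h):
    """One Runge-Kutta step for dy/dx = f1, dz/dx = f2, eqs. (13-50), (13-51)."""
    k1 = f1(x0, y0, z0) * h
    m1 = f2(x0, y0, z0) * h
    k2 = f1(x0 + h / 2, y0 + k1 / 2, z0 + m1 / 2) * h
    m2 = f2(x0 + h / 2, y0 + k1 / 2, z0 + m1 / 2) * h
    k3 = f1(x0 + h / 2, y0 + k2 / 2, z0 + m2 / 2) * h
    m3 = f2(x0 + h / 2, y0 + k2 / 2, z0 + m2 / 2) * h
    k4 = f1(x0 + h, y0 + k3, z0 + m3) * h
    m4 = f2(x0 + h, y0 + k3, z0 + m3) * h
    dy = (k1 + 2 * k2 + 2 * k3 + k4) / 6
    dz = (m1 + 2 * m2 + 2 * m3 + m4) / 6
    return x0 + h, y0 + dy, z0 + dz

def f1(x, y, z):
    return z                       # dy/dx = z

def f2(x, y, z):
    return -4 * x * z + 4 * y      # dz/dx = -4xz + 4y

x, y, z = 0.0, 0.0, 1.0            # assumed initial values (not given in the text)
for _ in range(10):
    x, y, z = rk4_system_step(f1, f2, x, y, z, 0.1)
print(x, y, z)
```

After ten steps the computed y and z should equal 1 to within rounding error, since the straight line y = x is reproduced exactly by the method.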

PART 2. ALGEBRAIC CALCULATIONS 


13.23. Numerical Solution of Transcendental Equations. — No general 
method exists for finding the roots of transcendental equations such as 
$xe^x = 1$ or $x^2 = \sin x$. Approximate values may always be found by
graphical means; where more precise results are required several analytical 
procedures are available. 

a. The Method of “ Regula Falsi ” Suppose the given equation ia 
f(x) = 0, then it is obvious that the plot of y = f(x) will give the required 

16 The differential equation and its solution are discussed in more detail by Gombas, 
t*., u Theorie und Losungsmethoden des Mehrteilchenproblems der Wellenmedhanik,” 
BirkhaUser, Basel, 1950. 



13.23 


NUMERICAL CALCULATIONS 


492 


root when y = 0, that is when the graph crosses the x-axis. Two values 
of x, say x 0 and xi with the corresponding values of y are selected from 
a graph or otherwise. Then if x 0 is near the root desired, a better approxi- 
mation for the root is given by 


where 


x = xo + Ax 
A (xi - x 0 ) | y 0 ! 

U.I + U.I 


(13-55) 


The process is continued until the required number of figures is obtained.

Example 17. Find the solution¹⁷ of $f(x) = (5 - x)e^x - 5 = 0$ near
$x = 5$. One solution is clearly $x = 0$; to find the other let $x_0 = 4.5$,
$x_1 = 5.0$; $y_0 = 40.00$, $y_1 = -5.00$, hence

$${}^1\Delta x = \frac{0.5 \times 40.00}{45.00} = 0.44; \quad {}^1x = 4.50 + 0.44 = 4.94$$

A second approximation with $x_0 = 4.94$; $y_0 = 3.382$ gives

$${}^2\Delta x = \frac{0.06 \times 3.382}{8.382} = 0.024; \quad {}^2x = 4.94 + 0.024 = 4.964$$

The third approximation with $x_0 = 4.964$; $y_0 = 0.1516$ gives

$${}^3\Delta x = \frac{0.036 \times 0.1516}{5.1516} = 0.001; \quad {}^3x = 4.964 + 0.001 = 4.965$$

Further repetition of the calculations shows that this result is correct to four
significant figures. The value 4.965114 has been obtained by Birge.¹⁸
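The repeated use of (13-55) in Example 17 may be sketched in code as follows. The function name `regula_falsi`, the stopping tolerance, and the rule for deciding which end point to replace (keep the pair that still brackets the root) are assumptions of this illustration.

```python
import math

def f(x):
    """Function of Example 17."""
    return (5.0 - x) * math.exp(x) - 5.0

def regula_falsi(func, x0, x1, tol=1e-9, max_iter=100):
    """Apply (13-55) repeatedly; x0 and x1 must lie on opposite sides of the root."""
    y0, y1 = func(x0), func(x1)
    if y0 * y1 > 0:
        raise ValueError("x0 and x1 do not bracket a root")
    x = x0
    for _ in range(max_iter):
        dx = (x1 - x0) * abs(y0) / (abs(y0) + abs(y1))
        x = x0 + dx
        y = func(x)
        if abs(y) < tol:
            break
        if y * y0 > 0:        # same sign as y0: the new point replaces x0
            x0, y0 = x, y
        else:                 # otherwise it replaces x1
            x1, y1 = x, y
    return x

root = regula_falsi(f, 4.5, 5.0)
```

Starting from $x_0 = 4.5$, $x_1 = 5.0$ as in the example, the end point $x_1 = 5.0$ is retained throughout and only $x_0$ moves, which is the pattern followed in the hand calculation.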

b. The Newton-Raphson Method. When the derivative of $f(x)$ is easily
evaluated numerically, the real roots of $f(x) = 0$ may be determined in the
following way. Suppose $x_0$ is an approximate value of one of the roots;
then an improved value of the root is given by

$$x = x_0 + \Delta x; \quad \Delta x = -\frac{f(x_0)}{f'(x_0)}$$ (13-56)

The next approximation is found by substituting x in place of $x_0$ to get a
new value of $\Delta x$, continuing in this way as long as necessary. In practice,
it will be found that after a few approximations, the value of the derivative
will change very little with succeeding values of x; hence $f'$ need not be
recomputed.

17 This equation occurs in the theory of black-body radiation; see, for example,
Taylor, H. S., and Glasstone, S., “A Treatise on Physical Chemistry,” Vol. 1, D. Van
Nostrand Co., Inc., New York, 1942.

18 Birge, R. T., Revs. Mod. Phys. 13, 233 (1941).





Example 18. Find x of Example 17, starting with $x_0 = 4.9$. Substitution
gives $f(x_0) = 8.43$; $f'(x_0) = -120.87$; $\Delta x = 8.43/120.87 = 0.07$;
${}^1x = 4.97$. The second approximation is obtained from $f(4.97) = -0.677$;
$f'(4.97) = -139.78$; $\Delta x = -0.677/139.78 = -0.005$; ${}^2x = 4.965$.
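Example 18 may be reproduced with a few lines of code. The sketch below applies (13-56) with the derivative $f'(x) = (4 - x)e^x$ obtained by differentiating the function of Example 17; the name `newton` and the stopping rule on $|\Delta x|$ are assumptions of this illustration, and the derivative is recomputed at every step for simplicity.

```python
import math

def f(x):
    """Function of Example 17."""
    return (5.0 - x) * math.exp(x) - 5.0

def df(x):
    """Its derivative, (4 - x) e^x."""
    return (4.0 - x) * math.exp(x)

def newton(func, dfunc, x0, tol=1e-12, max_iter=50):
    """Repeat x <- x + dx with dx = -f(x)/f'(x) until |dx| is below tol."""
    x = x0
    for _ in range(max_iter):
        dx = -func(x) / dfunc(x)
        x += dx
        if abs(dx) < tol:
            break
    return x

root = newton(f, df, 4.9)
```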


c. The Method of Iteration. If we rewrite our equation $f(x) = 0$ in the
form

$$x = \phi(x)$$ (13-57)

we may substitute an approximate value of x, say $x_0$, on the right of (57)
to get ${}^1x = \phi(x_0)$ and repeat to get

$${}^2x = \phi({}^1x); \quad {}^3x = \phi({}^2x); \quad \text{etc.}$$ (13-58)

It is often possible to write $f(x) = 0$ in the form $x = \phi(x)$ in several
different ways, in which case it is better to start with the simplest such
arrangement. A few approximations will indicate whether the chosen
form is suitable, but if the succeeding values of x do not converge rapidly,
one of the alternative functions should be tried. The condition for convergence
is found to be that the derivative of $\phi(x)$ be less than unity in absolute value
in the neighborhood of the desired root. As this derivative becomes smaller,
the convergence becomes more rapid.

Example 19. Find x of the function in Examples 17 and 18 by the
method of iteration. Writing the equation in the form $x = 5e^{-x}(e^x - 1)$
we find with $x_0 = 4.9$; $e^x = 134.3$; ${}^1x = (5 \times 133.3)/134.3 = 4.963$.
The next approximation gives $e^x = 143.1$; ${}^2x = (5 \times 142.1)/143.1 = 4.965$.
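The iteration of Example 19 is easily mechanized. In the sketch below, `phi` is the rearrangement used in the example, written as $5(1 - e^{-x})$; the function name `iterate` and the stopping tolerance are assumptions. Since $\phi'(x) = 5e^{-x}$ is about 0.035 near the root, the condition for convergence is comfortably met.

```python
import math

def phi(x):
    """Right-hand side of x = 5 e^{-x} (e^x - 1) = 5 (1 - e^{-x})."""
    return 5.0 * (1.0 - math.exp(-x))

def iterate(func, x0, tol=1e-12, max_iter=200):
    """Substitute repeatedly, x <- func(x), until successive values agree to tol."""
    x = x0
    for _ in range(max_iter):
        x_new = func(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

root = iterate(phi, 4.9)
```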

Problem. Solve the equation $x \log x = 1.5334$ by the methods of this section.
Ans.: x = 3.1110.

13.24. Simultaneous Equations in Several Unknowns.— The real roots 
of simultaneous algebraic or transcendental equations may be found by the 
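methods of secs. 13.23b or 13.23c. In the Newton-Raphson method,
when two equations are given

$$f(x,y) = 0; \quad g(x,y) = 0$$ (13-59)

(56) is replaced by

$$x = x_0 + \Delta x; \quad y = y_0 + \Delta y$$

where

$$\Delta x = \frac{\begin{vmatrix} -f(x_0,y_0) & f_y(x_0,y_0) \\ -g(x_0,y_0) & g_y(x_0,y_0) \end{vmatrix}}{\begin{vmatrix} f_x(x_0,y_0) & f_y(x_0,y_0) \\ g_x(x_0,y_0) & g_y(x_0,y_0) \end{vmatrix}}; \quad \Delta y = \frac{\begin{vmatrix} f_x(x_0,y_0) & -f(x_0,y_0) \\ g_x(x_0,y_0) & -g(x_0,y_0) \end{vmatrix}}{\begin{vmatrix} f_x(x_0,y_0) & f_y(x_0,y_0) \\ g_x(x_0,y_0) & g_y(x_0,y_0) \end{vmatrix}}$$

and the subscripts x and y denote partial derivatives.

In the method of iteration, we rewrite (59) as

$$x = \phi(x,y); \quad y = \psi(x,y)$$

then

$${}^1x = \phi(x_0,y_0); \quad {}^1y = \psi(x_0,y_0);$$

$${}^2x = \phi({}^1x,{}^1y); \quad {}^2y = \psi({}^1x,{}^1y); \quad \text{etc.}$$ (13-60)

Both methods are readily extended to cases of more than two unknowns.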
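The determinant form of the Newton-Raphson corrections for two unknowns may be sketched as follows. The pair of equations $x^2 + y^2 - 5 = 0$, $xy - 2 = 0$ (with a root at $x = 1$, $y = 2$), the starting values, and the name `newton_2d` are hypothetical choices made only to exercise the formulas.

```python
def newton_2d(f, g, fx, fy, gx, gy, x, y, tol=1e-12, max_iter=50):
    """Solve f = g = 0 by repeated corrections dx, dy written as 2 x 2 determinants."""
    for _ in range(max_iter):
        F, G = f(x, y), g(x, y)
        a, b = fx(x, y), fy(x, y)
        c, d = gx(x, y), gy(x, y)
        det = a * d - b * c                # | f_x  f_y ; g_x  g_y |
        if det == 0:
            raise ZeroDivisionError("determinant of the derivatives vanishes")
        dx = (-F * d + G * b) / det        # | -f  f_y ; -g  g_y | / det
        dy = (-a * G + F * c) / det        # | f_x  -f ; g_x  -g | / det
        x, y = x + dx, y + dy
        if abs(dx) + abs(dy) < tol:
            break
    return x, y

# Assumed test equations with a known root at (1, 2).
x_root, y_root = newton_2d(
    lambda x, y: x * x + y * y - 5.0,   # f
    lambda x, y: x * y - 2.0,           # g
    lambda x, y: 2.0 * x,               # f_x
    lambda x, y: 2.0 * y,               # f_y
    lambda x, y: y,                     # g_x
    lambda x, y: x,                     # g_y
    0.8, 2.2,
)
```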

13.25. Numerical Determination of the Roots of Polynomials. — Any 
of the methods of sec. 13.23 may be applied to determine the real roots of a 
polynomial. When all of the roots are not required, the Newton-Raphson 
method is probably more rapid than the others. 19 In order to evaluate 
$f(x)$ and $f'(x)$ for $x = x_0$, the following procedure will be found useful.
Suppose the polynomial is $y(x) = c_0x^n + c_1x^{n-1} + \cdots + c_n$. Write the
coefficients in a line, supplying zeros if any powers of x are missing. Multiply
the number $c_0$ by $x_0$ and add the result to $c_1$; multiply this sum ($d_1$)
by $x_0$ and add to $c_2$, continuing until the last sum is obtained; its value
equals $y(x)$ for $x = x_0$. The scheme is illustrated in Table 10. In actual
computation with a calculating machine nothing need be written down
since with proper care to locate the decimal point and due regard to sign,
the whole process may be performed as a continuous operation. The use
of this method is illustrated further in the last part of Example 20.
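The scheme of multiplying and adding may be written as a short routine. In the sketch below the name `horner` is an assumption, and the simultaneous accumulation of the derivative (a second multiply-and-add chain run on the partial sums) is a standard extension added here, not a step spelled out in the text. The polynomial used for the check is that of Example 20.

```python
def horner(coeffs, x0):
    """Return (y(x0), y'(x0)) for coefficients [c0, c1, ..., cn], highest power first."""
    y = 0.0    # running sums d_1, d_2, ... of the multiply-and-add scheme
    dy = 0.0   # the same scheme applied to the partial sums gives the derivative
    for c in coeffs:
        dy = dy * x0 + y
        y = y * x0 + c
    return y, dy

# Polynomial of Example 20: x^4 - 56x^3 + 490x^2 + 11,112x - 117,495.
coeffs = [1.0, -56.0, 490.0, 11112.0, -117495.0]
value_at_1, slope_at_1 = horner(coeffs, 1.0)
```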


TABLE 10

c_0      c_1        c_2        c_3        ...    c_n
         c_0 x_0    d_1 x_0    d_2 x_0    ...    d_(n-1) x_0
         d_1        d_2        d_3        ...    d_n


Graeffe's root-squaring method will be found to involve little more labor
than the preceding method with the added advantage that it gives all of
the roots of the polynomial at once. No initial approximation is required
and complex as well as real roots may be found. It is convenient to divide
by the coefficient of $x^n$ if necessary so that the polynomial appears in the
form $y(x) = x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_n = 0$. Using detached
coefficients, Table 11 is calculated. Care must be taken with the signs of the
doubled cross-products. The new coefficients $b_1, b_2, \ldots$ are then squared
and the cross-products of the b's determined in a similar way. As the
squaring process is continued, it will be found that the doubled cross-products
become progressively smaller, eventually contributing nothing to
the next squared terms. When this point is reached, there will be n coefficients,
say $m_1, m_2, \ldots, m_n$. Then if $x_1, x_2, \ldots, x_n$ are the n real roots of the
polynomial

$$|x_1|^p = m_1; \quad |x_2|^p = \frac{m_2}{m_1}; \quad \ldots; \quad |x_n|^p = \frac{m_n}{m_{n-1}}$$

or,

$$\log|x_1| = \frac{1}{p}\log m_1$$

$$\log|x_2| = \frac{1}{p}(\log m_2 - \log m_1)$$

$$\log|x_3| = \frac{1}{p}(\log m_3 - \log m_2)$$

$$\cdots$$

$$\log|x_n| = \frac{1}{p}(\log m_n - \log m_{n-1})$$ (13-61)

where $p = 2^s$ and s is the number of times the squaring operation has been
performed. The signs of the roots must be determined by some rule of
signs but this may often be done by inspection.

19 Horner's method does not appear to have any advantages over the Newton-Raphson
method. It is described by Mellor, J. W., “Higher Mathematics for Students
of Chemistry and Physics,” Longmans, Green and Co., New York, 1902, and in most
elementary algebra texts.


TABLE 11

1    a_1        a_2            a_3            a_4

1    a_1^2      a_2^2          a_3^2          a_4^2
     -2a_2      -2a_1 a_3      -2a_2 a_4      -2a_3 a_5
                +2a_4          +2a_1 a_5      +2a_2 a_6
                               -2a_6          -2a_1 a_7
                                              +2a_8

1    b_1        b_2            b_3            b_4

In practice, it is best to carry only four or five figures in the calculations,
hence tables of squares and four-place logarithms may be used if a calculating
machine is unavailable. If more figures are required in the roots,
the use of the Newton-Raphson method serves both to give these additional
figures and to check the previous calculations.

When two (or more) roots of the polynomial are real and equal, one of
the doubled cross-products will not decrease in magnitude as the squaring
proceeds; in fact it will always be equal to one-half of the squared term
which stands just above it. The squaring in this case is stopped when
the other cross-products no longer contribute to the next coefficients.

The presence of complex roots in a polynomial expression is revealed
by the fact that the doubled cross-products do not disappear and the signs
of some of the sums alternate as the squaring proceeds. The method of
finding the complex roots as well as pairs of real roots is described in detail
by both Scarborough and Whittaker and Robinson (loc. cit.).
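For real roots of distinct magnitude, the squaring scheme of Table 11 and the extraction of the roots by (13-61) may be sketched as follows. The names `graeffe_step` and `graeffe_magnitudes`, the fixed number of squarings, and the use of exact integer arithmetic (which sidesteps the very large powers of ten that arise) are assumptions of this illustration; equal or complex roots are not handled.

```python
import math

def graeffe_step(a):
    """One squaring step on [1, a1, ..., an]: b_k = a_k^2 - 2a_(k-1)a_(k+1) + 2a_(k-2)a_(k+2) - ..."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        total = a[k] * a[k]
        j = 1
        while k - j >= 0 and k + j <= n:
            total += 2 * (-1) ** j * a[k - j] * a[k + j]   # doubled cross-product
            j += 1
        b.append(total)
    return b

def graeffe_magnitudes(a, squarings=6):
    """Return |x_1| >= |x_2| >= ... from (13-61) after the given number of squarings."""
    m = list(a)
    for _ in range(squarings):
        m = graeffe_step(m)
    p = 2 ** squarings
    logs = [math.log10(abs(c)) for c in m]      # logs[0] = log 1 = 0
    return [10 ** ((logs[k] - logs[k - 1]) / p) for k in range(1, len(m))]

# Cubic of the Problem in this section, x^3 - 15x^2 + 74x - 120, with roots 4, 5, 6.
magnitudes = graeffe_magnitudes([1, -15, 74, -120])
```

Only the magnitudes are returned; as stated above, the signs must be settled separately.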





TABLE 12

1     -5.600 × 10         4.900 × 10^2        1.111 × 10^4       -1.175 × 10^5

       3.136 × 10^3       2.401 × 10^5        1.234 × 10^8        1.381 × 10^10
      -0.980 × 10^3      12.44  × 10^5        1.152 × 10^8
                         -2.350 × 10^5
1      2.156 × 10^3       1.250 × 10^6        2.386 × 10^8        1.381 × 10^10

       4.648 × 10^6       1.562 × 10^12       5.693 × 10^16
      -2.500 × 10^6      -1.029 × 10^12      -3.452 × 10^16
                          0.028 × 10^12
1      2.148 × 10^6       5.610 × 10^11       2.241 × 10^16       1.907 × 10^20

       4.614 × 10^12      3.147 × 10^23       5.022 × 10^32
      -1.122 × 10^12     -0.963 × 10^23      -2.140 × 10^32
                          0.004 × 10^23
1      3.492 × 10^12      2.188 × 10^23       2.882 × 10^32       3.637 × 10^40

       1.219 × 10^25      4.787 × 10^46       8.306 × 10^64
      -0.044 × 10^25     -0.201 × 10^46      -1.591 × 10^64
1      1.175 × 10^25      4.586 × 10^46       6.715 × 10^64       1.323 × 10^81

1      1.381 × 10^50      2.103 × 10^93       4.388 × 10^129      1.750 × 10^162

1      1.904 × 10^100     4.414 × 10^186      1.925 × 10^259      3.062 × 10^324


Example 20. Find the four real roots of the polynomial²⁰

$$y(x) = x^4 - 56x^3 + 490x^2 + 11{,}112x - 117{,}495 = 0$$

The method is apparent from Table 12. It will be seen that the second row
of doubled cross-products may be neglected after the eighth power terms
and the first row after the thirty-second power terms, hence the squaring
is stopped after raising the coefficients to the sixty-fourth power. We then
find that

$$\log|x_1| = 100.2797/64 = 1.5669$$

$$\log|x_2| = (186.6448 - 100.2797)/64 = 1.3494$$

$$\log|x_3| = (72.6396)/64 = 1.1350$$

$$\log|x_4| = (65.2016)/64 = 1.0188$$

so that $|x_1| = 36.89$; $|x_2| = 22.36$; $|x_3| = 13.65$; $|x_4| = 10.45$.

Inspection shows that all signs are positive except that of $x_3$. With these
values, the sum of the roots is 56.05, in approximate agreement with the
coefficient of $x^3$ in the original equation.

20 Solution of similar equations is needed to calculate the energy levels of the asymmetric
top in quantum mechanics; see, for example, Herzberg, G., “Infrared and Raman
Spectra of Polyatomic Molecules,” D. Van Nostrand Co., Inc., New York, 1945.

In order to improve these values, we make use of the Newton-Raphson
method. With $x_1 = 36.89$, we find by the scheme of Table 10

$$d_1 = (1 \times 36.89) - 56 = -19.11$$

$$d_2 = (-19.11 \times 36.89) + 490 = -214.97$$

$$d_3 = (-214.97 \times 36.89) + 11{,}112 = 3{,}181.76$$

$$y(x_1) = d_4 = (3{,}181.76 \times 36.89) - 117{,}495 = -120$$

In the same way, from

$$y'(x) = 4x^3 - 168x^2 + 980x + 11{,}112$$

we find

$$d_1 = (4 \times 36.89) - 168 = -20.44$$

$$d_2 = -(20.44 \times 36.89) + 980 = 225.97$$

$$y'(x_1) = d_3 = (225.97 \times 36.89) + 11{,}112 = 19{,}448$$

Then,

$$\Delta x_1 = 120/19{,}448 = 0.0062$$

and

$${}^1x_1 = 36.89 + 0.0062 = 36.8962$$

Repeating the calculations, we obtain $y({}^1x_1) = 3.57$; $y'({}^1x_1) = 19{,}478$;
$\Delta\,{}^1x_1 = -0.0002$; ${}^2x_1 = 36.8960$. This value is correct to five significant
figures. The same procedure applied to the other roots gives 22.3410;
-13.6669; 10.4302. The sum of these values which is 56.0005 gives a
further check on the results.
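The refinement of a Graeffe root by the Newton-Raphson method may be sketched as below. The routine evaluates the polynomial and its derivative by the multiply-and-add scheme of Table 10; the names `horner` and `newton_poly` and the stopping tolerance are assumptions. Because the code carries full machine precision, its intermediate values need not agree figure for figure with the rounded hand calculation above, although the converged root does.

```python
def horner(coeffs, x0):
    """Return (y(x0), y'(x0)) for coefficients listed from the highest power down."""
    y, dy = 0.0, 0.0
    for c in coeffs:
        dy = dy * x0 + y
        y = y * x0 + c
    return y, dy

def newton_poly(coeffs, x, tol=1e-10, max_iter=50):
    """Refine an approximate root x of the polynomial by Newton-Raphson steps."""
    for _ in range(max_iter):
        y, dy = horner(coeffs, x)
        dx = -y / dy
        x += dx
        if abs(dx) < tol:
            break
    return x

# Polynomial of Example 20, starting from the Graeffe value 36.89.
coeffs = [1.0, -56.0, 490.0, 11112.0, -117495.0]
root = newton_poly(coeffs, 36.89)
```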

Problem. Find the roots of $x^3 - 15x^2 + 74x - 120 = 0$ by the Graeffe method.
Ans.: x = 4, 5, 6.


13.26. Numerical Solution of Simultaneous Linear Equations. — 
Systems of the form 

$$\sum_{k=1}^{n} a_{ki}x_k = g_i \quad (i = 1, 2, \ldots, n)$$ (13-62)

where the $a_{ki}$ and $g_i$ are numbers and the $x_k$ are sought, often occur in
physical problems, particularly in the solution of the normal equations
resulting from a least squares treatment of numerical data (see sec. 13.37b).
Several methods of solving such equations are given by Whittaker and
Robinson (loc. cit.) but none of these is particularly suitable for machine
calculation (see also sec. 10.9). When $a_{ki} = a_{ik}$, which is usually true,
the determinantal method described there offers certain advantages, but in
general when the number of unknowns is greater than four or five the
labor of evaluating the determinants becomes prohibitive. The following
systematic procedure²¹ which is well adapted for machine calculation will
be found useful in such cases.

Using detached coefficients, the numbers in (62) are written down as in
Table 13. For convenience we assume that there are only three unknowns;
extension of the method to a larger number may be made without difficulty.

TABLE 13

         g_1       g_2       g_3      |
(A)      a_11      a_12      a_13     |   -a_22/a_21    -a_23/a_21     (B)
         a_21*     a_22      a_23     |    1             0
         a_31      a_32      a_33     |    0             1

         g_2'      g_3'               |
         b_11      b_12               |   -b_22/b_21
         b_21*     b_22               |    1

         g_3''
         c_11*

Choose some unknown, say $x_2$, for elimination. Divide the numbers of the
corresponding row of (A) by the first number in that row (we indicate it
with a star), change their signs, and add ones and zeros as shown to form (B).
Now consider $g_1, g_2, g_3$ as a row matrix and multiply the columns of (B) by this
row (see sec. 10.6). The results are $g_2'$ and $g_3'$. For example,
$g_2' = g_1 \times (-a_{22}/a_{21}) + g_2 \times 1 + g_3 \times 0$ and
$g_3' = g_1 \times (-a_{23}/a_{21}) + g_2 \times 0 + g_3 \times 1$.
Multiply rows of (A) by columns of (B), omitting the starred row
of (A). This gives the numbers $b_{ij}$. Again star an element and repeat
the process until the last unknown is eliminated. The values of x are then
given by

$$x_1 = g_3''/c_{11}$$

$$x_3 = (g_2' - b_{11}x_1)/b_{21}$$

$$x_2 = (g_1 - a_{11}x_1 - a_{31}x_3)/a_{21}$$ (13-63)

Some care must be exercised in the order of elimination of the x's, especially
if they are of widely different magnitudes. It is always advisable to begin
with the smallest one, proceeding with the elimination in order of increasing
magnitude. If this is not done, the cumulative errors in the calculations
will produce unsatisfactory values of the unknowns.

21 See Frazer, R. A., Duncan, W. J., and Collar, A. R., “Elementary Matrices,”
Cambridge University Press, 1938; Jeffreys, H. and Jeffreys, B. S., “Methods of Mathematical
Physics,” Second Edition, Cambridge University Press, 1950; and Milne, loc. cit.





Example 21. Fit the data of Example 3, sec. 13.3 to an equation of the
form $e = x_1 + x_2t + x_3t^2$. The three simultaneous equations become

$$x_1 + 630.5x_2 + 3.975 \times 10^5x_3 = 5.535$$

$$x_1 + 960.5x_2 + 9.226 \times 10^5x_3 = 9.117$$

$$x_1 + 1063.0x_2 + 11.300 \times 10^5x_3 = 10.301$$ (13-64)


TABLE 14

       5.535             9.117             10.301           |
       1                 1                 1                |   -2.32101    -2.84277
       630.5             960.5             1063.0           |    1           0
       3.975 × 10^5*     9.226 × 10^5      11.300 × 10^5    |    0           1

      -3.72979          -5.43373                            |
      -1.32101          -1.84277                            |   -1.45033
      -502.897*         -729.366                            |    1

      -0.02430
      +0.07313*


Since the magnitude of the x's is probably $x_3 < x_2 < x_1$, we choose the
starred numbers in that order. If we desire four significant figures in the
final results, we note that we must carry six figures in the calculations,
since two figures disappear in one of the steps. The scheme is shown in
Table 14. Then,

$$x_1 = -0.02430/0.07313 = -0.3323$$

$$x_2 = -(-3.730 - 1.321 \times 0.3323)/502.9 = 0.00829$$

$$x_3 = (5.535 + 0.3323 - 630.5 \times 0.00829)/(3.975 \times 10^5) = 1.611 \times 10^{-6}$$

Substitution of these results in the original equations gives as a check,
5.535, 9.116, 10.300.
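As a cross-check on Example 21, the three equations (13-64) may be solved by machine. The sketch below is not the tabular scheme of Table 14: it uses ordinary Gaussian elimination with partial pivoting, a standard substitute chosen here for brevity, and the name `solve_linear` is an assumption. Each row of `a` holds the coefficients of one equation.

```python
def solve_linear(a, g):
    """Solve the system a x = g by elimination with partial pivoting and back-substitution."""
    n = len(g)
    m = [list(row) + [gi] for row, gi in zip(a, g)]   # augmented array
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("the equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

# Equations (13-64).
a = [[1.0, 630.5, 3.975e5],
     [1.0, 960.5, 9.226e5],
     [1.0, 1063.0, 11.300e5]]
g = [5.535, 9.117, 10.301]
x1, x2, x3 = solve_linear(a, g)
```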

13.27. Evaluation of Determinants. — The procedure just outlined is 
also applicable to the evaluation of determinants, the scheme being similar 
to that shown in Table 14 except for the fact that the g’s are omitted. If 
the starred elements are taken in the first column and row, that is, in the 
order $a_{11}, b_{11}, c_{11}$, etc., the value of the determinant equals the product of
all of these starred elements. If some other order is chosen as in Example
21, the determinant still equals this product but it must be multiplied by
$(-1)^n$ where n is the number of interchanges required to bring the starred
elements into the position of the element which stands first in the corresponding
array. If it is convenient to choose starred elements that are not
in the first column the necessary modification of the procedure will be found
described by Frazer, Duncan and Collar (loc. cit.). A method for determinants,
with suitable checking procedures, is also given by Milne (loc. cit.).




Example 22. Evaluate the determinant of the coefficients of the x's
in Example 21. Since two interchanges are required to bring the first
starred element to the position $a_{11}$ and one interchange to bring $b_{21}$ to $b_{11}$,
the value of the determinant $\Delta$ is given by

$$\Delta = (-1)^3 \times (3.975 \times 10^5) \times (-502.897) \times (0.07313) = 1.4619 \times 10^7$$

Problem. Evaluate some of the determinants of Example 23, sec. 13.28. The
answers are found in Table 16.
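The value found in Example 22 may be checked by a short routine. The sketch below reduces the array by elimination, multiplies the pivots together, and changes the sign once for every interchange of rows, which is the same bookkeeping as the factor $(-1)^n$ above. The name `determinant` and the use of partial pivoting are assumptions of this illustration.

```python
def determinant(rows):
    """Determinant by elimination: product of the pivots, with a sign change per row interchange."""
    m = [list(r) for r in rows]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # one interchange reverses the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

# Coefficients of the x's in Example 21.
delta = determinant([[1.0, 630.5, 3.975e5],
                     [1.0, 960.5, 9.226e5],
                     [1.0, 1063.0, 11.300e5]])
```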

13.28. Solution of Secular Determinants. — In many quantum mechani- 
cal problems, it is necessary to find one or more roots of a secular equation 
(see sec. 10.14): 

$$y(\lambda) = |a_{ij} - b_{ij}\lambda| = 0$$ (13-65)

$a_{ij} = a_{ji}$, $b_{ij} = b_{ji}$; $i, j = 1, 2, \ldots, N$. In most cases, $b_{ij} = \delta_{ij}$, but even
if this is not true in the original form of the determinant it is usually possible
to reduce (65) to this form by suitable addition and subtraction of rows
and columns. We shall assume here that $\lambda$ occurs only in the diagonal elements.
The particular method to be used in finding values of $\lambda$ depends to
some extent on the special problem at hand. We present three methods,
each of which has certain advantages.


a. The Polynomial Method. When (65) is expanded, it obviously
gives a polynomial of the N-th degree in $\lambda$. Once this polynomial is
obtained, either of the methods of sec. 13.25 may be used to find values of $\lambda$.
Graeffe's method is particularly useful when it is required to find all of
them. To convert the determinant into the polynomial, its expansion
may be effected by the usual method of reduction of its order (see sec.
10.3) or by a very convenient procedure which has been described by
Hicks.²²

According to the latter method, we substitute $\lambda = 0, 1, 2, \ldots, (N + 1)$
in the given determinant and evaluate each numerically. From these
(N + 2) results, $y_0, y_1, y_2, \ldots, y_{N+1}$, a table of differences is formed as
described in sec. 13.2. An immediate check on the computation of the
determinants is available, for the (N + 1)-st differences should vanish.
The polynomial is then given by


$$y(\lambda) = \sum_{t=0}^{N} p_t\lambda^t$$ (13-66)

where

$$p_0 = y_0; \quad p_t = \sum_{s=t}^{N} r_{ts}\,\Delta^s y_0; \quad t \ge 1$$ (13-67)

22 Hicks, B. L., J. Chem. Phys. 8, 569 (1940).

The coefficients $r_{ts}$ are independent of the values of the elements in (65)
and may be computed from the following relations:

$$r_{ts} = \frac{c_t(s)}{s!}$$ (13-68)

where

$$c_s(s) = 1; \quad c_t(s + 1) = c_{t-1}(s) - s\,c_t(s); \quad c_0(s) = 0$$

The results may be checked by the identities

$$c_1(s + 1) = (-1)^s s!, \; s \ge 1; \quad c_{s-1}(s) = \frac{s(1 - s)}{2}; \quad \sum_{t=1}^{s}\frac{c_t(s)}{s!} = 0, \; s > 1; \quad \sum_{t=1}^{s}\frac{|c_t(s)|}{s!} = 1$$ (13-69)

Values of the $r_{ts}$ through t = s = 6 are given in Table 15.

TABLE 15


Example 23. As an example of the use of this method, we choose the
secular determinant whose expanded form served as an example for the
Graeffe method (see Example 20). The determinant follows

$$y(\lambda) = \begin{vmatrix} 36 - \lambda & -4.062 & 0 & 0 \\ -4.062 & 16 - \lambda & 8.216 & 0 \\ 0 & 8.216 & 4 - \lambda & 14.49 \\ 0 & 0 & 14.49 & -\lambda \end{vmatrix}$$

Making the substitutions $\lambda = 0, 1, 2, 3, 4, 5$ in turn and evaluating the
determinants, we obtain Table 16. The fact that the fifth differences
vanish assures us that the determinants have been computed correctly.

TABLE 16

λ      y            Δ           Δ^2       Δ^3       Δ^4
0      -117,495
1      -105,948     +11,547
2       -93,743     +12,205     +658
3       -81,180     +12,563     +358      -300
4       -68,535     +12,645     + 82      -276      +24
5       -56,060     +12,475     -170      -252      +24

From (67), Tables 15 and 16, we find

$$p_0 = -117{,}495$$

$$p_1 = 11{,}547 - \frac{658}{2} - \frac{300}{3} - \frac{24}{4} = 11{,}112$$

$$p_2 = \frac{658}{2} + \frac{300}{2} + \frac{11 \times 24}{24} = 490$$

$$p_3 = -\frac{300}{6} - \frac{24}{4} = -56$$

$$p_4 = \frac{24}{24} = 1$$

hence the required polynomial is

$$y(\lambda) = \lambda^4 - 56\lambda^3 + 490\lambda^2 + 11{,}112\lambda - 117{,}495$$

in agreement with the result given in Example 20.
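The passage from the table of differences to the coefficients of the polynomial may be sketched as follows. Instead of tabulated values of $r_{ts}$, the code expands each term $\Delta^s y_0\,\lambda(\lambda - 1)\cdots(\lambda - s + 1)/s!$ of the Newton forward-difference formula directly; this yields the same coefficients but is an equivalent route chosen for this illustration rather than a transcription of Hicks's tables. The names `forward_differences` and `poly_from_differences` are assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def forward_differences(ys):
    """Return the leading differences [y_0, Δy_0, Δ^2 y_0, ...] of equally spaced values."""
    diffs, row = [ys[0]], list(ys)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[0])
    return diffs

def poly_from_differences(ys):
    """Coefficients [p_0, p_1, ...] of the polynomial taking the values ys at λ = 0, 1, 2, ..."""
    diffs = forward_differences(ys)
    coeffs = [Fraction(0)] * len(ys)
    basis = [Fraction(1)]            # coefficients of λ(λ-1)...(λ-s+1)/s!, lowest power first
    for s, d in enumerate(diffs):
        for t, c in enumerate(basis):
            coeffs[t] += d * c
        nxt = [Fraction(0)] * (len(basis) + 1)
        for t, c in enumerate(basis):        # multiply the basis by (λ - s)/(s + 1)
            nxt[t + 1] += c / (s + 1)
            nxt[t] -= c * s / (s + 1)
        basis = nxt
    return coeffs

# Values of the determinant from Table 16 for λ = 0, 1, ..., 5.
ys = [-117495, -105948, -93743, -81180, -68535, -56060]
p = poly_from_differences(ys)
```

The last coefficient returned is zero, which corresponds to the vanishing of the fifth differences.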

b. Matrix Method. A matrix method, described by Frazer, Duncan
and Collar (loc. cit.) is sometimes useful. It gives the largest value of
$|\lambda|$ only, but in quantum mechanical problems this is often all that is
required. The method does not converge rapidly unless the largest root
is widely separated from the remaining ones. The procedure is as follows.
Set $\lambda = 0$ in the secular determinant and multiply the resulting matrix by
a matrix of one column. The latter is arbitrary but in its most convenient
form it contains unity in one row and zeros in the other rows. Extract a
constant scalar quantity from the resulting matrix product and multiply
the original matrix with the new one-column matrix. Continue in the same
way until the scalar quantity becomes constant. This is the required root
of largest magnitude.

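Example 24. Find the largest root of the secular determinant of
Example 23. The procedure is apparent from the following.

$$\begin{pmatrix} 36 & 4.062 & 0 & 0 \\ 4.062 & 16 & 8.216 & 0 \\ 0 & 8.216 & 4 & 14.49 \\ 0 & 0 & 14.49 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 36 \\ 4.062 \\ 0 \\ 0 \end{pmatrix} = 36 \begin{pmatrix} 1 \\ 0.1128 \\ 0 \\ 0 \end{pmatrix}$$

For the next approximation,

$$\begin{pmatrix} 36 & 4.062 & 0 & 0 \\ 4.062 & 16 & 8.216 & 0 \\ 0 & 8.216 & 4 & 14.49 \\ 0 & 0 & 14.49 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0.1128 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 36.46 \\ 5.87 \\ 0.93 \\ 0 \end{pmatrix} = 36.46 \begin{pmatrix} 1 \\ 0.1610 \\ 0.0256 \\ 0 \end{pmatrix}$$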



Continuing in the same way, we obtain the results in Table 17. The sixth,
seventh, eighth and ninth approximations give 36.85, 36.87, 36.88, 36.88,
hence $\lambda = 36.88$. Comparison with Example 20 shows that this result is
uncertain in the last place. The convergence here is not very rapid since
the next largest root is 22.341. More rapid convergence could be obtained
by squaring the original matrix several times before commencing the matrix
multiplications. The constant value so obtained is then some power of the
desired root. Once having found the largest root, the next largest one may
be obtained by the same method. Further details are given by Frazer,
Duncan and Collar (loc. cit.).
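The repeated multiplication of Example 24 may be sketched in code. As in the example, the scalar extracted at each stage is the first element of the product, so that the column always begins with unity; this presumes that the first element never vanishes. The name `largest_root`, the tolerance, and the limit on the number of multiplications are assumptions of this illustration.

```python
def largest_root(matrix, start, tol=1e-10, max_iter=500):
    """Multiply repeatedly by the matrix, extracting the first element as the scalar factor."""
    v = list(start)
    lam = 0.0
    for _ in range(max_iter):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        new_lam = w[0]                       # scalar factor; assumed non-zero
        v = [c / new_lam for c in w]         # column normalized to begin with unity
        if abs(new_lam - lam) < tol:
            return new_lam, v
        lam = new_lam
    return lam, v

# Matrix of Example 24 (the secular determinant of Example 23 with λ = 0).
matrix = [[36.0, 4.062, 0.0, 0.0],
          [4.062, 16.0, 8.216, 0.0],
          [0.0, 8.216, 4.0, 14.49],
          [0.0, 0.0, 14.49, 0.0]]
root, column = largest_root(matrix, [1.0, 0.0, 0.0, 0.0])
```

Carried to convergence the scalar settles close to the value 36.896 found in Example 20, rather than the 36.88 reached after nine hand multiplications.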


TABLE 17

Successive Column Matrices

       Third                 Fourth                Fifth
36.65     1            36.76     1            36.81     1
 6.85     0.1869        7.37     0.2005        7.68     0.2086
 1.42     0.0387        1.84     0.0500        2.07     0.0562
 0.37     0.0101        0.56     0.0152        0.72     0.0196

c. Iteration Method. Several iteration methods which do not depend
on matrix properties have been described.²³ Crude approximations to the
roots of the polynomial are given by the diagonal terms in the secular determinant.
Suppose one of these values, say $\lambda_0$, is substituted in the determinant
for $\lambda$ in every place except one where the quantity $\lambda_0 - \lambda$ occurs.
Now if the determinant is evaluated, the resultant value of $\lambda$ is the next
approximation to the true value. The process may be repeated as often
as necessary.

Example 25. Find a root of the determinant of Example 23 by the
iteration method. Taking $\lambda_0 = 36$, the determinant becomes

$$\begin{vmatrix} 36 - \lambda & 4.062 & 0 & 0 \\ 4.062 & -20 & 8.216 & 0 \\ 0 & 8.216 & -32 & 14.49 \\ 0 & 0 & 14.49 & -36 \end{vmatrix}$$

When this is evaluated, we obtain ${}^1\lambda = 36.893$. Substitution of ${}^1\lambda$ in the
original determinant gives ${}^2\lambda = 36.896$. The third approximation gives
the same result.

Problem. Compute some of the coefficients of Table 15.

23 See, for example, James and Coolidge, J. Chem. Phys. 1, 825 (1933); Cross and
Crawford, J. Chem. Phys. 5, 621 (1937). Another iterative method, which gives both
the eigenvalues and the amplitudes for a system of homogeneous linear equations, has
been described by W. Kohn, J. Chem. Phys. 17, 670 (1949).




PART 3. ERRORS AND LEAST SQUARES 

13.29. Errors. — Measurements are always accompanied by errors. 
They are of two kinds: determinate and random. Those of the first type²⁴
are often constant or systematic, being due to faulty or incorrectly adjusted
instruments, mistakes on the part of the observer in reading a scale, recording
a number or other similar effects. It is usually possible to discover the
causes of such errors and to make corrections for them. Random errors,
on the other hand, are indeterminate and due to unknown causes, but they
may be treated by statistical methods. As in the previous parts of this
chapter, we shall often refer the reader to other sources²⁵ for proofs of
theorems and results to be given here.

Suppose several equally reliable measurements of a physical quantity
yield the numbers $X_1, X_2, \ldots, X_n$. The corresponding errors are defined
by

$$x_1 = X_1 - X, \quad x_2 = X_2 - X, \quad \ldots, \quad x_n = X_n - X$$ (13-70)

where X is the true value of the quantity. Actually, we seldom know²⁶
the true value since any experiment made to determine it will be accompanied
by random errors. However, in order to proceed further we must
choose some quantity which is called the most probable value. It will be
indicated by $\bar{X}$, the notation anticipating a fact that we prove in sec. 13.30,
namely, the most probable value is the average of all the data. Since $\bar{X}$ is
not equal to X, the true value, we must distinguish between the error and
the residual which is defined by

$$d_1 = X_1 - \bar{X}, \quad d_2 = X_2 - \bar{X}, \quad \ldots, \quad d_n = X_n - \bar{X}$$ (13-71)

It is assumed that the errors and residuals with which we are concerned
are random ones. They are neither systematic nor constant but are equally
likely to be positive or negative. Small errors are more frequent than large
ones and very large errors do not occur at all. Under these conditions,
the errors follow the laws of probability as given by the normal “Gauss”
distribution (see sec. 12.3)

$$w(x) = \frac{e^{-x^2/2\sigma^2}}{\sigma\sqrt{2\pi}}$$

24 Errors of this kind are discussed in some detail by Crumpler, T. B. and Yoe, J. H.,
“Chemical Computations and Errors,” John Wiley and Sons, New York, 1940. They
may be detected in some cases by methods explained by Birge, R. T., Phys. Rev. 40, 207
(1932).

25 See references at end of chapter.

26 An exception is the case where the quantity is exact by definition. For example,
the true value of the atomic weight of oxygen is 16.0000 to as many decimal places as
may be needed.





It is convenient for our purposes to change the notation, writing $w(x) = N$
and $h^2 = 1/2\sigma^2$. The resulting equation

$$N = \frac{h}{\sqrt{\pi}}e^{-h^2x^2}$$ (13-72)

gives us the relative number of measurements N having an error x. The
plot of N vs. x is called the Gauss error curve; it is shown in Fig. 1 for
h = 1 and h = 0.6. From that curve or from eq. (72), we can discover
the meaning of the constant h which is called the precision index. When
it is large, N is large for a given small error x and decreases as x increases.
Thus a high precision index means that a large number of the measurements
agree closely with the true value of the quantity observed. On the other
hand, if h is small, a smaller fraction of the results are close to the true value
and more large errors occur than in the previous case.

The probability that the error of a single measurement will lie between
the limits $\pm a$ is

$$\frac{h}{\sqrt{\pi}}\int_{-a}^{a}e^{-h^2x^2}dx = \frac{2h}{\sqrt{\pi}}\int_{0}^{a}e^{-h^2x^2}dx$$

This integral occurs so often in mathematical physics that it has been given
the special name of the error function. It is usually denoted by

$$\operatorname{erf}(t) = \frac{2}{\sqrt{\pi}}\int_{0}^{t}e^{-y^2}dy$$




Hence the probability in question is erf(ha). It cannot be evaluated in
finite form but must be expanded in a power series and integrated term by
term. Values of the integral as a function of t are found in all books on
probability.²⁷
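The term-by-term integration mentioned above may be sketched as follows. Expanding $e^{-y^2}$ in powers of y and integrating gives $\operatorname{erf}(t) = (2/\sqrt{\pi})\sum_k (-1)^k t^{2k+1}/[k!\,(2k+1)]$. The function name `erf_series` and the fixed number of terms are assumptions; the series is practical only for moderate values of t.

```python
import math

def erf_series(t, terms=60):
    """Sum the power series of erf(t) obtained by integrating exp(-y^2) term by term."""
    total = 0.0
    term = t                      # (-1)^k t^(2k+1) / k!  for k = 0
    for k in range(terms):
        total += term / (2 * k + 1)
        term *= -t * t / (k + 1)  # advance to the next value of k
    return 2.0 / math.sqrt(math.pi) * total

# Probability that an error lies within ±a for precision index h (illustrative values).
h, a = 1.0, 0.5
probability = erf_series(h * a)
```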

The special case where the limits of integration are $\pm\infty$ is of considerable
interest. The error must lie somewhere within this range, hence the
probability must be unity. This is readily found to be true when the integration
is performed.

The simplest way of evaluating the integral when the limits are $\pm\infty$
is the following. Let

$$I = \int_{-\infty}^{\infty}e^{-u^2}du; \quad I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-(u^2+v^2)}\,du\,dv$$

Transforming to polar coordinates we get: $du\,dv = r\,dr\,d\phi$, $u^2 + v^2 = r^2$,

$$I^2 = \int_{0}^{2\pi}d\phi\int_{0}^{\infty}re^{-r^2}dr = \pi$$

Thus we see that the area under the whole curve (72) is unity. This,
obviously, is the reason for the constant $2/\sqrt{\pi}$.

13.30. Principle of Least Squares. — Suppose n measurements have 
been made, the i-th one having the error $x_i$. The probability that this error lies
between $x_i$ and $x_i + dx_i$ is

$$P_i = \frac{h}{\sqrt{\pi}}e^{-h^2x_i^2}dx_i$$ (13-73)

The probability that the n errors $x_1, x_2, \ldots, x_n$ occur is the product of n
terms like (73), for each measurement is an independent event. Hence
we have

$$P = \prod_i P_i = \left(\frac{h}{\sqrt{\pi}}\right)^n e^{-h^2(x_1^2 + x_2^2 + \cdots + x_n^2)}\,dx_1\,dx_2\cdots dx_n$$ (13-74)

Clearly the differentials $dx_1, dx_2, \ldots$ are arbitrary, for they may be interpreted
as the smallest subdivisions on a scale which is being read. Finally,
remembering that h is fixed, we see that the probability P is a maximum
when the exponent of e is a minimum; thus we have

$$x_1^2 + x_2^2 + \cdots + x_n^2 = \text{a minimum}$$ (13-75)

as the criterion for the most probable value obtainable from n equally
reliable measurements of a quantity. This result is known as the Principle
of Least Squares.

27 See also “Tables of the Probability Functions P(x) and Erf(x),” Works Progress
Administration, New York City, 1941.





In accordance with that principle, let us determine the most probable
value of the set of measurements $X_1, X_2, \ldots, X_n$. Rewrite (75) in the form

$$(X_1 - X)^2 + (X_2 - X)^2 + \cdots + (X_n - X)^2$$

differentiate with respect to X and equate the derivative to zero in order to
obtain a minimum. Since the result is to be the most probable value of X,
we replace X by the symbol $\bar{X}$ to indicate that $\bar{X}$ is chosen to satisfy eq.
(75). The answer is

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n}X_i$$ (13-76)

As might be expected the most probable value is the arithmetic mean of all
of the experimental results. It is interesting to note that the error law of
eq. (72) is, within reasonable limits, the only form of equation which gives
the average as the most probable value.²⁸
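The differentiation that leads to (13-76) may be written out explicitly. The following is only a spelled-out version of the step described above, with the second derivative added as a check that the stationary value is indeed a minimum.

```latex
\frac{d}{dX}\sum_{i=1}^{n}(X_i - X)^2
  = -2\sum_{i=1}^{n}(X_i - X)
  = -2\Bigl(\sum_{i=1}^{n}X_i - nX\Bigr) = 0
\quad\Longrightarrow\quad
\bar{X} = \frac{1}{n}\sum_{i=1}^{n}X_i ,
\qquad
\frac{d^2}{dX^2}\sum_{i=1}^{n}(X_i - X)^2 = 2n > 0 .
```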

13.31. Errors and Residuals. — If we add n errors, we find, since 
$x_i = X_i - X$,

$$\sum X_i = nX + \sum x_i$$

and from eq. (76)

$$\bar{X} = \frac{1}{n}\sum X_i = X + \frac{1}{n}\sum x_i$$ (13-77)

Also, we obtain for the first residual

$$d_1 = X_1 - \bar{X} = X_1 - X - \frac{1}{n}\sum x_i = x_1 - \frac{1}{n}\sum x_i = \frac{(n - 1)}{n}x_1 - \frac{1}{n}x_2 - \frac{1}{n}x_3 - \cdots$$ (13-78)

with similar equations for the others. We thus conclude that as n increases,
the second term on the right of (77) becomes smaller and $\bar{X}$ approaches the
true value X. In the same way, we conclude from (78) that as n increases,
the residuals approach the true errors. Actually, if we square n equations
like (78) and add them, we get

$$\sum d_i^2 = \sum x_i^2 - \frac{1}{n}\Bigl(\sum x_i\Bigr)^2$$

so that the sum of the squares of the residuals is slightly less than the sum
of the squares of the errors.

28 A proof is given by Plummer, H. C., “Probability and Frequency,” Macmillan
Co., London, 1940, p. 123.

Suppose two independent quantities ($M_1$ and $M_2$) have been measured
and the errors in each case obey the normal law. Then the probability of
an error between $x_1$ and $x_1 + dx_1$ in $M_1$ is

$$p_1 = \frac{h_1}{\sqrt{\pi}}e^{-h_1^2x_1^2}dx_1$$

while the probability of an error between $x_2$ and $x_2 + dx_2$ in $M_2$ is

$$p_2 = \frac{h_2}{\sqrt{\pi}}e^{-h_2^2x_2^2}dx_2$$

Since the observed quantities are independent of each other, the probability
of the simultaneous occurrence of these errors in $M_1$ and $M_2$ is

$$p = p_1p_2$$

Now suppose that $M_1$ and $M_2$ are combined linearly to form a quantity

$$M = a_1M_1 + a_2M_2$$

where $a_1$ and $a_2$ are constants. The error in M will lie between

$$a_1x_1 + a_2x_2 = x \tag{13-79}$$

and

$$a_1(x_1 + dx_1) + a_2(x_2 + dx_2) = x + dx$$

We recognize the fact that such an error may be composed of any value of
$x_1$ between $\pm\infty$ together with the corresponding value of $x_2$ fixed by eq.
(79). Thus to compute the probability of an error x in M we integrate
$p = p_1p_2$ with respect to $x_1$ between the limits $\pm\infty$ and eliminate $dx_2$ by
the relation $dx = a_2dx_2$ which will be true when the integration has been
performed over $x_1$. Let us first rewrite p in terms of $x_1$ and x which gives


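$$p = C\exp\left[-h_1^2x_1^2 - \frac{h_2^2}{a_2^2}(x - a_1x_1)^2\right]dx_1\,dx_2$$

where $C = h_1h_2/\pi$. With the further abbreviation

$$H^2 = \frac{h_1^2h_2^2}{a_1^2h_2^2 + a_2^2h_1^2}$$

we also have

$$p = C\exp\left[-H^2x^2 - \frac{h_1^2h_2^2}{a_2^2H^2}\left(x_1 - \frac{a_1H^2x}{h_1^2}\right)^2\right]dx_1\,dx_2$$

Let $N(x)dx$ be the required probability of an error in M between x and
x + dx; then

$$N(x)dx = Ce^{-H^2x^2}dx_2\int_{-\infty}^{\infty}\exp\left[-\frac{h_1^2h_2^2}{a_2^2H^2}\left(x_1 - \frac{a_1H^2x}{h_1^2}\right)^2\right]dx_1$$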





Since $a_2dx_2 = dx$ we see that

$$N(x) = \frac{H}{\sqrt{\pi}}e^{-H^2x^2}$$

where

$$H^2 = \frac{h_1^2h_2^2}{a_1^2h_2^2 + a_2^2h_1^2}$$

or

$$\frac{1}{H^2} = \frac{a_1^2}{h_1^2} + \frac{a_2^2}{h_2^2} \tag{13-80}$$


Thus the error law for M is the same as the error law for $M_1$ and $M_2$, the
only difference being in the precision index. The equation is easily
generalized by the same method; in fact it may be shown that if

$$M = \sum a_iM_i$$

the precision index of M is given by

$$\frac{1}{H^2} = \sum\frac{a_i^2}{h_i^2} \tag{13-81}$$


We would like to apply this result to the residuals. From (78), we may
write

$$d_i = \frac{(n-1)}{n}x_i - \frac{1}{n}{\sum_j}'x_j \tag{13-78a}$$

where the prime on the summation sign means that the term i = j is
omitted. The residuals are thus linear combinations of the errors, for $d_i$
corresponds to M in the preceding discussion and

$$a_1 = \frac{(n-1)}{n}; \quad a_2 = a_3 = \cdots = a_n = -\frac{1}{n}$$


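The error law for the residuals is of the form of (72) or (80)

$$N(d) = \frac{H}{\sqrt{\pi}}e^{-H^2d^2} \tag{13-80a}$$

and from (81), since h is the precision index for each $x_i$,

$$\frac{1}{H^2} = \frac{1}{h^2}\left[\frac{(n-1)^2}{n^2} + \frac{(n-1)}{n^2}\right] = \frac{1}{n^2h^2}\left[(n-1)^2 + n - 1\right]$$

or

$$H = h\sqrt{\frac{n}{n-1}} \tag{13-82}$$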





From (82), it is seen that the precision index for the residuals depends on 
both n and h and is always larger than h. Reference to Fig. 1 shows that 
the curve of (80a) rises higher in the middle and falls off more rapidly than 
the curve of (72) but as the number of measurements increases the two 
graphs approach each other more closely. 
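The rule for combining precision indices is easy to evaluate directly. The following is a minimal sketch, assuming the general rule is 1/H^2 = sum(a_i^2 / h_i^2) for a linear combination M = sum(a_i * M_i); the function names are hypothetical. Applied to the coefficients of a residual, (n - 1)/n for one error and -1/n for the other n - 1, it gives H = h * sqrt(n / (n - 1)), which is always larger than h.

```python
import math

def combined_precision_index(coeffs, indices):
    """Precision index H of M = sum(a_i * M_i), from 1/H^2 = sum(a_i^2 / h_i^2)."""
    inv_H2 = sum((a / h) ** 2 for a, h in zip(coeffs, indices))
    return 1.0 / math.sqrt(inv_H2)

def residual_precision_index(h, n):
    """Precision index of a residual d_i built from n errors of common index h."""
    coeffs = [(n - 1) / n] + [-1.0 / n] * (n - 1)
    return combined_precision_index(coeffs, [h] * n)

for n in (2, 5, 10, 100):
    # The ratio H/h tends to 1 as the number of measurements grows.
    print(n, residual_precision_index(1.0, n))
```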

13.32. Measures of Precision. — Having obtained the most probable
value of a series of measurements, we need to find expressions for its
reliability. In order to do this we must first consider the case where the true
value $X$ of the quantity is known. We may then proceed to the more
practical question of expressing the uncertainty of $\bar{X}$ in terms of the
residuals. If the precision index were known it would be suitable for our measure
of precision, for as we have seen in sec. 13.29, erf(hx) is the probability that
the error is within the range $\pm x$. However, h has the dimension of
a reciprocal error and it proves more convenient to use as a precision
measure a quantity which is inversely proportional to h, thus having the
same dimension as the error itself. Three such measures are commonly
employed; they are the average error (a), the root mean square error (m)
and the probable error (r).

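The average error is the arithmetic mean of all the errors without regard
to sign

$$a = \frac{\sum|x_i|}{n} \tag{13-83}$$

From its definition (see sec. 12.3), it follows that

$$a = \int_{-\infty}^{\infty}|x|\,N\,dx = \frac{2h}{\sqrt{\pi}}\int_0^{\infty}xe^{-h^2x^2}dx = \frac{1}{h\sqrt{\pi}} \tag{13-84}$$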


Let us seek the most probable value of h. We recall that P of eq. (74)
is the probability of the simultaneous occurrence of the errors $x_1, x_2, \cdots,
x_n$. Hence we must make P of that equation a maximum. Taking the
logarithm of (74) we see that the most probable value of h is that quantity
$h'$ which makes

$$\phi = n\log h - h^2\sum x_i^2$$

a maximum, or

$$\frac{d\phi}{dh} = \frac{n}{h'} - 2h'\sum x_i^2 = 0$$

hence

$$h' = \sqrt{\frac{n}{2\sum x_i^2}}$$

The quantity m defined by

$$m = \sqrt{\frac{\sum x_i^2}{n}} = \frac{1}{h\sqrt{2}} \tag{13-85}$$





is called the root mean square error. Comparison of (84) with (85) shows
that

$$m = a\sqrt{\frac{\pi}{2}} \tag{13-86}$$

The root mean square error is frequently used in mathematical statistics;
there it is called the standard deviation and indicated by $\sigma$ (see sec. 12.3,
especially problem a).

The probable error is defined as that error r such that one half of the
errors of n observations are greater than r and one half are less than r.
Thus it is given by the integral

$$\operatorname{erf}(hr) = \frac{1}{2} \tag{13-87}$$

for this says that there is an equal chance that a given error lies within $\pm r$
or outside these limits. From tables of the integral, we obtain

$$hr = 0.4769363\cdots \tag{13-88}$$

Combining this result with (85) we get for the probable error

$$r = 0.6745\sqrt{\frac{\sum x_i^2}{n}} = 0.6745m \tag{13-89}$$

From eqs. (86) and (89) we can readily obtain all relations between
a, m and r. They are

$$r = 0.4769h^{-1} = 0.6745m = 0.8453a$$

$$m = 0.7071h^{-1} = 1.4826r = 1.2533a$$

$$a = 0.5642h^{-1} = 0.7979m = 1.1829r$$
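These numerical factors can be reproduced without tables. The sketch below assumes only the definitions erf(h r) = 1/2, h m = 1/sqrt(2) and h a = 1/sqrt(pi); the bisection helper `solve_erf` is a hypothetical name introduced for the example and relies on the standard-library `math.erf`.

```python
import math

def solve_erf(target, lo=0.0, hi=6.0, tol=1e-12):
    """Find t with erf(t) = target by bisection (erf is increasing on [0, 6])."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h_r = solve_erf(0.5)               # probable error times h
h_m = 1.0 / math.sqrt(2.0)         # root mean square error times h
h_a = 1.0 / math.sqrt(math.pi)     # average error times h

print("r/m =", h_r / h_m, " r/a =", h_r / h_a)
print("m/r =", h_m / h_r, " m/a =", h_m / h_a)
print("a/m =", h_a / h_m, " a/r =", h_a / h_r)
```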


The geometric significance of the three precision measures is also of interest.
The average error a is the abscissa of the center of gravity of the area
bounded by the error curve and the axes x and N of eq. (72). To see this,
let $x_0$ be the center of gravity of that area, then

$$x_0 = \frac{\int_0^{\infty}xN\,dx}{\int_0^{\infty}N\,dx} = \frac{1}{h\sqrt{\pi}} = a$$

which follows from (84) since $\int_{-\infty}^{\infty}N\,dx = 1$.


The root mean square error is the radius of gyration of the same area 
about the N axis ; it is also the abscissa of the point of inflection of the 





error curve, as will now be shown. For the point of inflection, $d^2N/dx^2 = 0$
and from (72)

$$\frac{d^2N}{dx^2} = -\frac{2h^3}{\sqrt{\pi}}e^{-h^2x^2}(1 - 2h^2x^2)$$

Thus,

$$(1 - 2h^2x^2) = 0$$

or

$$x = \pm\frac{1}{h\sqrt{2}} = \pm m$$

From the definition of r, it follows that the abscissa x = r corresponds
to the ordinate which bisects the area of the error curve (72) between 0
and $\infty$.



The relative sizes and positions of these three measures are shown in 
Fig. 2 where we draw only that half of (72) corresponding to positive 
values of x. It is perhaps not amiss to comment on the most appropriate 
measure to use. The average error recommends itself because of the ease 
with which it is computed. The probable error is less easy to calculate 29 

29 Convenient tables of $0.6745/\sqrt{n}$ as a function of n and other quantities useful in
the calculation of errors may be found in “Handbook of Chemistry and Physics,”
Chemical Rubber Publishing Co., Cleveland, Ohio.





but is perhaps more often used than the others in chemical and physical
literature. As may be seen from Fig. 2 it is the smallest of the three and is
thus more flattering than a or m to a set of experimental data. There is
little choice between the three measures on theoretical grounds.

It is often of importance to find some estimate of the probable error of
an adopted precision measure itself. The result has been obtained by
Gauss 30 who shows that the relative error of r is

$$\frac{0.4769}{\sqrt{n}}$$

With 10 measurements, it is seen that the probable error is uncertain by
about 15 per cent while for even 500 measurements the uncertainty is
2 per cent. It thus follows that it is seldom if ever meaningful to state
the probable error with more than two significant figures, for usually one
of these is uncertain.

13.33. Precision Measures and Residuals. — From the equations of
the previous section it is a simple matter to express the precision measures
in terms of residuals. Suppose $X_1, X_2, \cdots, X_n$ are n observations. If they
follow the error law, the residual $d_i$ is given by eq. (78a) and the index of
precision of the residuals by eq. (82). Therefore, the average error

$$a = \frac{1}{h\sqrt{\pi}} = \sqrt{\frac{n}{n-1}}\,\frac{\sum|d_i|}{n} = \frac{\sum|d_i|}{\sqrt{n(n-1)}} \tag{13-83a}$$

Similarly,

$$m = \frac{1}{h\sqrt{2}} = \sqrt{\frac{n}{n-1}}\sqrt{\frac{\sum d_i^2}{n}} = \sqrt{\frac{\sum d_i^2}{n-1}} \tag{13-85a}$$

and

$$r = 0.6745m = 0.6745\sqrt{\frac{\sum d_i^2}{n-1}} \tag{13-89a}$$

The differences between eqs. (83), (85), (89) and (83a), (85a), (89a) 
should be carefully noted. In many cases, the deviations are used in 
place of the errors to get a from (83) rather than from the correct 
eq. (83a). The difference is negligible, of course, in most cases. 
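As an illustration, the three precision measures can be computed from residuals as follows. This is a sketch assuming the residual forms a = sum|d_i| / sqrt(n(n-1)), m = sqrt(sum d_i^2 / (n-1)) and r = 0.6745 m; the sample readings and the function name are hypothetical.

```python
import math

def precision_from_residuals(measurements):
    """Return (mean, a, m, r) computed from the residuals d_i = X_i - mean."""
    n = len(measurements)
    mean = sum(measurements) / n
    d = [X - mean for X in measurements]
    a = sum(abs(di) for di in d) / math.sqrt(n * (n - 1))   # average error
    m = math.sqrt(sum(di * di for di in d) / (n - 1))       # root mean square error
    r = 0.6745 * m                                          # probable error
    return mean, a, m, r

readings = [10.1, 9.9, 10.2, 9.8, 10.0]                     # made-up data
mean, a, m, r = precision_from_residuals(readings)
print(mean, a, m, r)
```

Using the divisor n instead of sqrt(n(n-1)) for a, or n instead of n - 1 for m, corresponds to treating the residuals as if they were true errors.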

The most probable value or arithmetic mean also follows the error law.
Its index of precision is obtained from (81) where $a_i = 1/n$, hence

$$\frac{1}{H^2} = \frac{1}{h^2}\sum a_i^2 = \frac{1}{h^2n}$$


30 A derivation of it is given by Plummer, loc. cit. 





Thus if a, m and r refer to the individual members of a set of n
measurements, the corresponding precision measures relating to the arithmetic
mean are

$$A = \pm\frac{a}{\sqrt{n}}, \quad M = \pm\frac{m}{\sqrt{n}}, \quad R = \pm\frac{r}{\sqrt{n}}$$

It will be observed that the precision varies as the square root of n. There- 
fore comparatively little is gained by increasing n, for in order to change 
the precision by one decimal point n must be multiplied by 100. This is in 
accord with common sense which suggests that instead of making 100 
measurements it is more economical and reasonable to seek an improvement 
in the experimental method. A graph of r versus n is shown in Fig. 3. It 
will be seen from that curve that it is seldom worthwhile to make more than 
10 measurements of a given quantity by the same method. 



13.34. Experiments of Unequal Weight. — It often happens that the 
results of one experimenter are more reliable than those of another. This 
may be due to superior method or apparatus, to greater experience with the 
operations involved or to other reasons. Moreover, because of particu- 
larly favorable conditions, the same investigator may obtain better results 
at some times than at others. In all such cases, more weight is attached to 
some of the data than to the remainder of them. For example, if one result
$X_1$ has a weight twice that of $X_2$, then the average $\bar{X} = (2X_1 + X_2)/3$.
A result of weight w is thus equivalent to w results of unit weight, or we say 
that a result of large weight has a high precision index. 

If the j-th measurement is of weight $w_j$, the weighted average or most
probable value is

$$\bar{X}_w = \frac{\sum w_iX_i}{\sum w_i}$$





The probable error for the value of weight $w_j$ is

$$p_{w_j} = \pm 0.6745\sqrt{\frac{\sum w_id_i^2}{w_j(n-1)}}$$

and for the weighted average

$$P_w = \pm 0.6745\sqrt{\frac{\sum w_id_i^2}{(n-1)\sum w_i}}$$

It is possible to determine the relative weights to be attached to the
individual measurements since the weight $w_j$ is inversely proportional to
$p_{w_j}^2$. The usual custom is to assign weights arbitrarily.
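A short sketch of the weighted calculation follows. It assumes the weighted mean sum(w_i X_i) / sum(w_i) and the probable errors 0.6745 * sqrt(sum(w_i d_i^2) / ((n-1) w_j)) for a single value of weight w_j and 0.6745 * sqrt(sum(w_i d_i^2) / ((n-1) sum(w_i))) for the weighted mean, with d_i taken from the weighted mean. The function name and the data are hypothetical.

```python
import math

def weighted_mean_with_errors(values, weights):
    """Return (weighted mean, probable error of the mean, probable error of each value)."""
    n = len(values)
    W = sum(weights)
    mean = sum(w * X for w, X in zip(weights, values)) / W
    # Weighted sum of squared residuals about the weighted mean.
    S = sum(w * (X - mean) ** 2 for w, X in zip(weights, values))
    P_mean = 0.6745 * math.sqrt(S / ((n - 1) * W))
    p_each = [0.6745 * math.sqrt(S / ((n - 1) * w)) for w in weights]
    return mean, P_mean, p_each

# A value of weight 2 counts like two values of unit weight.
print(weighted_mean_with_errors([3.0, 6.0], [2.0, 1.0]))
```

With all weights equal to one, the probable error of the mean reduces to the unweighted result r divided by the square root of n.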

13.35. Probable Error of a Function. — In general the results of several
independently measured quantities are combined to give the final value of
the physical constant desired. Suppose $X, Y, \cdots$ have been obtained as
the average value of certain quantities with probable errors $P_X, P_Y, \cdots$.
If they are combined to give Z, where

$$Z = f(X, Y, \cdots)$$

then its probable error is

$$P = \sqrt{\left(P_X\frac{\partial Z}{\partial X}\right)^2 + \left(P_Y\frac{\partial Z}{\partial Y}\right)^2 + \cdots}$$


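We record a few special cases for convenience of reference.

1. $Z = X \pm Y$; $P = \pm\sqrt{P_X^2 + P_Y^2}$

2. $Z = XY$; $P = \pm\sqrt{(XP_Y)^2 + (YP_X)^2}$

3. $Z = X/Y$; $P = \pm\dfrac{1}{Y^2}\sqrt{Y^2P_X^2 + X^2P_Y^2}$

4. $Z = a + bX$. Suppose we know the value $Z_1$ with its probable
error $p_1$ at the point $X = X_1$ and $Z_2$ with error $p_2$ at $X = X_2$. We wish to
fit the two points to a linear equation. Then

$$P_a = \sqrt{\left(\frac{p_1X_2}{X_2 - X_1}\right)^2 + \left(\frac{p_2X_1}{X_1 - X_2}\right)^2}$$

$$P_b = \sqrt{\left(\frac{p_1}{X_2 - X_1}\right)^2 + \left(\frac{p_2}{X_1 - X_2}\right)^2}$$

$$P_Z = \sqrt{\left(\frac{p_1(X_2 - X)}{(X_2 - X_1)}\right)^2 + \left(\frac{p_2(X_1 - X)}{(X_1 - X_2)}\right)^2}$$

where $P_a$, $P_b$ and $P_Z$ are the probable errors in a, b and Z, respectively.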
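The general propagation formula can be applied to any function by estimating the partial derivatives numerically. The sketch below is an illustration only: `propagate` is a hypothetical helper, the central-difference step size is an arbitrary choice, and the numbers are made up. It is checked against the closed forms for a sum, a product and a quotient.

```python
import math

def propagate(f, values, errors, rel_step=1e-6):
    """P = sqrt(sum((P_i * dZ/dX_i)^2)), derivatives by central differences."""
    total = 0.0
    for i, (v, p) in enumerate(zip(values, errors)):
        step = rel_step * max(abs(v), 1.0)
        up = list(values)
        dn = list(values)
        up[i] = v + step
        dn[i] = v - step
        dZ = (f(*up) - f(*dn)) / (2.0 * step)   # numerical partial derivative
        total += (p * dZ) ** 2
    return math.sqrt(total)

X, Y, PX, PY = 4.0, 2.0, 0.1, 0.05
print(propagate(lambda x, y: x + y, [X, Y], [PX, PY]))
print(propagate(lambda x, y: x * y, [X, Y], [PX, PY]))
print(propagate(lambda x, y: x / y, [X, Y], [PX, PY]))
```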





13.36. Rejection of Observations. — Occasionally a single measurement
from a set differs so widely from the others that the experimenter is tempted
to discard it. A simple rule in such cases, based on statistical methods, is
the following: Calculate the average of all the data including the suspected
measurement. Find each residual and calculate the probable error of a
single determination. If any residual exceeds five times the probable error
it may be rejected, the supposition being that the error cannot be a random
one. The reason for the use of this rule is as follows. Suppose the
probability of an error as large as x in the quantity measured is 0.001, then the
chance that an error as large as x will not occur is 0.999. Let us then
determine the value of hx for which erf(hx) = 0.999. From tables of this
integral we find 31

$$hx = 2.326$$

Now from eq. (88) we have

$$hr = 0.4769$$

thus

$$x = 4.9r$$

We conclude that the probability of an error 5 times as great as the
probable error of a single measurement is less than 1 in 1000, hence the somewhat
dogmatic rule for rejecting such measurements.
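The rejection rule can be sketched in a few lines. The code assumes the probable error of a single determination is 0.6745 * sqrt(sum(d_i^2) / (n - 1)), computed with the suspected value included as the rule prescribes; the helper names and the data set are hypothetical. The bisection on `math.erf` reproduces the factor of about 4.9.

```python
import math

def solve_erf(target, lo=0.0, hi=6.0, tol=1e-12):
    """Find t with erf(t) = target by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def reject_outliers(measurements, factor=5.0):
    """Split data into (kept, rejected) using |residual| > factor * probable error."""
    n = len(measurements)
    mean = sum(measurements) / n                      # includes the suspected value
    d = [X - mean for X in measurements]
    r = 0.6745 * math.sqrt(sum(di * di for di in d) / (n - 1))
    kept = [X for X, di in zip(measurements, d) if abs(di) <= factor * r]
    rejected = [X for X, di in zip(measurements, d) if abs(di) > factor * r]
    return kept, rejected, r

ratio = solve_erf(0.999) / solve_erf(0.5)             # x / r for a 1-in-1000 error
data = [10.0 + 0.1 * (-1) ** k for k in range(19)] + [15.0]
kept, rejected, r = reject_outliers(data)
print(ratio, rejected, r)
```

Because the suspected value inflates the probable error, the rule can only reject a point when the set is reasonably large; with very few measurements no residual can reach five times r.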

13.37. Empirical Formulas.— As mentioned in sec. 13.1, there is con- 
siderable advantage in representing experimental data by means of equa- 
tions, the correct form of them being often suggested by theoretical con- 
siderations. In other cases, plots of various functions of the data may 
indicate a suitable form. When this question is settled, the next step is to 
determine the constants in the equation. Sometimes a graph may be used 
for this purpose, for if the equation is linear it is only necessary to deter- 
mine the slope and intercept of the curve. In more exact work, numerical 
methods are needed. 

a. The Method of Averages. Suppose that the quantity y has been
observed as a function of another quantity x, the resulting numbers being
$y_1, y_2, \cdots, y_n$. It has been decided that a polynomial of the m-th degree,
m < n, is a suitable equation

$$y = A + Bx + Cx^2 + \cdots \tag{13-90}$$

Divide the measurements into groups equal in number to the unknown
constants, placing an equal number of results in each group if this is
possible. Add the equations in each group thus obtaining a set of
simultaneous equations equal in number to the number of unknowns. The
equations may be solved by the methods of sec. 13.26.

31 See, for example, the reference in footnote 29. 





It will be found in general that this procedure is quite satisfactory. 
The resulting constants are different for different groupings of the data, but 
the simplest such grouping is usually better than any other. If there are a 
large number of results or if the polynomial is of degree higher than four or 
five this method is nearly as good as the method of least squares and entails 
considerably less calculation. 
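For the simplest case of a straight line y = A + Bx the method of averages needs only two groups. The sketch below assumes the data are ordered by x and split into a first and a second half; the function name is hypothetical. Each group's equations are added to give n_g * A + (sum x) * B = sum y, and the resulting pair is solved by determinants.

```python
def fit_line_by_averages(xs, ys):
    """Fit y = A + B*x by summing the equations in two groups and solving."""
    half = (len(xs) + 1) // 2
    groups = [(xs[:half], ys[:half]), (xs[half:], ys[half:])]
    # One summed equation per group: n_g * A + (sum of x) * B = (sum of y).
    (n1, sx1, sy1), (n2, sx2, sy2) = [
        (len(gx), sum(gx), sum(gy)) for gx, gy in groups
    ]
    det = n1 * sx2 - n2 * sx1
    A = (sy1 * sx2 - sy2 * sx1) / det
    B = (n1 * sy2 - n2 * sy1) / det
    return A, B

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0 + 2.0 * x for x in xs]          # exact line, so A = 1 and B = 2
print(fit_line_by_averages(xs, ys))
```

A different grouping of the same noisy data would give slightly different constants, which is the arbitrariness mentioned in the text.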

b. The Method of Least Squares. Suppose as before that n values are
available for y but that the chosen equation is of a more general form than
(90),

$$y = f(x, A, B, C, \cdots) \tag{13-91}$$

If there are n constants we may obviously fit the data exactly to such an
equation but usually there will only be m < n constants. Thus the
calculated value of y will not agree with the observed one. Let

$$y_i = y_i(\text{calc.}) + d_i$$

where $y_i$ is an observed y and $y_i$(calc.) is the corresponding calculated one
using the constants finally adopted. In accordance with the principle of
least squares we wish to make

$$\sum d_i^2 = \text{a minimum} \tag{13-92}$$

Let us now assume that we have found approximate values of the
constants by graphical means or otherwise so that

$$A = A_0 + a; \quad B = B_0 + b; \quad C = C_0 + c; \quad \cdots$$

$a, b, c, \cdots$ being small correction terms. Then the i-th equation of (91)

$$f_i(A, B, C, \cdots) = y_i - d_i$$

may be written as

$$f_i(A_0, B_0, C_0, \cdots) + a\frac{\partial f_i}{\partial A_0} + b\frac{\partial f_i}{\partial B_0} + c\frac{\partial f_i}{\partial C_0} + \cdots = y_i - d_i \tag{13-93}$$

where we have discarded derivatives of second and higher order. Using
the abbreviations

$$\frac{\partial f_i}{\partial A_0} = u_i; \quad \frac{\partial f_i}{\partial B_0} = v_i; \quad \frac{\partial f_i}{\partial C_0} = w_i$$

and

$$y_i - f_i(A_0, B_0, C_0, \cdots) = F_i$$

(93) becomes

$$u_ia + v_ib + w_ic + \cdots - F_i + d_i = 0$$




where $u_i, v_i, w_i, F_i$ are known and $a, b, c, d_i$ are unknown. Since we wish
(92) to hold we must require that

$$\sum_{i=1}^{n}(u_ia + v_ib + w_ic + \cdots - F_i)^2 = \phi(a, b, c)$$

be a minimum or that

$$\frac{\partial\phi}{\partial a} = 2\sum(u_ia + v_ib + w_ic + \cdots - F_i)u_i = 0$$

$$\frac{\partial\phi}{\partial b} = 2\sum(u_ia + v_ib + w_ic + \cdots - F_i)v_i = 0 \tag{13-94}$$

$$\frac{\partial\phi}{\partial c} = 2\sum(u_ia + v_ib + w_ic + \cdots - F_i)w_i = 0$$

These equations (when divided by two) are called the normal equations.
There will be as many of them as there are unknowns.

In many cases, the chosen relation between x and y is a polynomial,
when some simplification in the procedure is possible. The original
equations corresponding to (91) will be of the form

$$A + Bx_i + Cx_i^2 + \cdots = y_i \tag{13-95}$$

It is still worthwhile to use approximate values of the constants for then the
normal equations will be easier to handle. If this is done (95) becomes

$$a + bx_i + cx_i^2 + \cdots = F_i \tag{13-95a}$$

In either case, the normal equations may be written down without
differentiation. They are found as follows: (1) multiply each equation of (95)
or (95a) by the coefficient of the first unknown (unity since we are speaking
of A or a) and add the resulting n equations; (2) multiply each equation
by the coefficient of the next unknown ($x_i$) and add these equations; (3)
continue in the same way until each equation has been multiplied by the
coefficient of each unknown. The resulting normal equations, which are
identical with those obtained by the procedure leading to eq. (94), may then
be solved by the methods of sec. 13.26 to obtain the constants. The final
equation should always be checked by using it to compute each known $y_i$.
The sum of the squares of the residuals should be small and the algebraic
sum of the residuals themselves should be nearly zero. 32
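The recipe for writing down the normal equations of a polynomial fit translates directly into code. The sketch below is illustrative: it works with the constants A, B, C themselves rather than with small corrections to approximate values, it uses a plain Gaussian elimination in place of the methods of sec. 13.26, and the function names and data are hypothetical.

```python
def solve_linear(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    A = [row[:] + [b] for row, b in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x

def polyfit_normal_equations(xs, ys, degree):
    """Least-squares coefficients [A, B, C, ...] of y = A + B x + C x^2 + ..."""
    k = degree + 1
    # Row p: multiply every equation by x_i**p (coefficient of unknown p) and add.
    M = [[sum(x ** (p + q) for x in xs) for q in range(k)] for p in range(k)]
    rhs = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(k)]
    return solve_linear(M, rhs)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]                       # made-up, roughly linear data
A, B = polyfit_normal_equations(xs, ys, 1)
residuals = [y - (A + B * x) for x, y in zip(xs, ys)]
print(A, B, sum(residuals), sum(d * d for d in residuals))
```

The printed algebraic sum of the residuals is zero to rounding error, which is the check recommended in the text.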

Such a procedure will show how closely the curve fits the known points 
but says nothing about the reliability of the curve at other places. In the 

32 Further details of the method of least squares are given by Brunt, D., “ The 
Combination of Observations,” Cambridge Press, 1917. He describes several schemes 
for checking the calculations and evaluating the constants with their probable errors. 
See also, Birge, R. T., Revs. Mod. Phys. 19, 298 (1947). 





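important case of a linear equation, y = a + bx, the formulas 33 are
comparatively simple. The probable errors in a and b are

$$P_a = r_e\sqrt{\frac{\sum x_i^2}{D}}; \quad P_b = r_e\sqrt{\frac{n}{D}}$$

where

$$r_e = 0.6745\sqrt{\frac{\sum d_i^2}{(n-2)}}; \quad D = n\sum x_i^2 - \Bigl(\sum x_i\Bigr)^2$$

The error in y at any point x (x not necessarily a measured value) is

$$P_y = r_e\sqrt{\frac{1}{n} + \frac{n(\bar{x} - x)^2}{D}}$$

33 See Birge, loc. cit.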
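A sketch of the straight-line case with probable errors is given below. It assumes the standard least-squares results scaled by 0.6745: r_e = 0.6745 * sqrt(sum d_i^2 / (n - 2)), D = n * sum(x_i^2) - (sum x_i)^2, P_a = r_e * sqrt(sum(x_i^2) / D), P_b = r_e * sqrt(n / D), and P_y(x) = r_e * sqrt(1/n + n (x - xbar)^2 / D). The function name and data are hypothetical.

```python
import math

def line_fit_with_probable_errors(xs, ys):
    """Least-squares line y = a + b*x with probable errors of a, b and y(x)."""
    n = len(xs)
    Sx, Sy = sum(xs), sum(ys)
    Sxx = sum(x * x for x in xs)
    Sxy = sum(x * y for x, y in zip(xs, ys))
    D = n * Sxx - Sx ** 2
    a = (Sxx * Sy - Sx * Sxy) / D
    b = (n * Sxy - Sx * Sy) / D
    d2 = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    r_e = 0.6745 * math.sqrt(d2 / (n - 2))     # probable error of one observation
    P_a = r_e * math.sqrt(Sxx / D)
    P_b = r_e * math.sqrt(n / D)

    def P_y(x):
        # Probable error of the fitted ordinate at an arbitrary abscissa x.
        xbar = Sx / n
        return r_e * math.sqrt(1.0 / n + n * (x - xbar) ** 2 / D)

    return a, b, P_a, P_b, P_y

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b, P_a, P_b, P_y = line_fit_with_probable_errors(xs, ys)
print(a, b, P_a, P_b, P_y(2.0), P_y(6.0))
```

The error of the fitted ordinate is smallest at the mean abscissa and grows on extrapolation; at x = 0 it coincides with the probable error of the intercept.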


REFERENCES 


Numerical calculations of various kinds are discussed in: 

Allen, D. N. deG., “ Relaxation Methods,” McGraw-Hill Book Co., Inc., New York, 
1954. 

Collatz, L., “Eigenwert Probleme und Ihre Numerische Behandlung,” Chelsea Publish-
ing Co., New York, 1945.

Dwyer, P. S., “ Linear Computations,” John Wiley and Sons, Inc., New York, 1951.

Householder, A. S., “ Principles of Numerical Analysis,” McGraw-Hill Book Co., Inc.,
New York, 1953.

Milne, W. E., “ Numerical Calculus,” Princeton University Press, Princeton, 1949. 

Milne, W. E., “ Numerical Solution of Differential Equations,” John Wiley and Sons, 
Inc., New York, 1953. 

Scarborough, J. B., “Numerical Mathematical Analysis,” Second Edition, The Johns 
Hopkins Press, Baltimore, 1950. 

Shaw, F. S., “ An Introduction to Relaxation Methods,” Dover Publications, Inc., 
New York, 1953. 

Whittaker, E. T., and Robinson, G., “ The Calculus of Observations,” Second Edition,
D. Van Nostrand Co., Inc., New York, 1930.

Willers, F. A., “ Practical Analysis. Graphical and Numerical Methods,” translated 
by R. T. Beyer, Dover Publications, Inc., New York, 1948. 

Probability, the theory of errors and related subjects are treated in: 

Arley, N. and Buch, K. R., “ Introduction to Mathematical Probability,” John Wiley 
and Sons, Inc., New York, 1949. 

Beers, Y., “ Theory of Errors,” Addison-Wesley Publishing Co., Cambridge, 1953. 

Deming, W. E., “ Statistical Adjustment of Data,” John Wiley and Sons, Inc., New 
York, 1943. 

Jeffreys, H., “Theory of Probability,” Second Edition, Oxford University Press, New
York, 1948.

Kolmogorov, A., “ Foundations of the Theory of Probability,” Chelsea Publishing Co.,
New York, 1950.

Uspensky, J. V., “ Introduction to Mathematical Probability,” McGraw-Hill Book Co., 
Inc., New York, 1937. 

Wilson, E. B., Jr., “An Introduction to Scientific Research,” McGraw-Hill Book Co., 
New York, 1952. 

Youden, W. J., “ Statistical Methods for Chemists,” John Wiley and Sons, Inc., New
York, 1951.



CHAPTER 14 

LINEAR INTEGRAL EQUATIONS 


14.1. Definitions and Terminology. — An integral equation is one which 
contains the unknown function behind the integral sign. Its importance 
for physical problems lies in the fact that most differential equations together 
with their boundary conditions may be reformulated to give a single integral 
equation . If the latter can be solved, the mathematical difficulties are 
not appreciably greater even when the number of independent variables 
is increased, while differential equations, such as Laplace’s, are considerably 
more complex in three dimensions than in two. The theory of integral 
equations also furnishes a uniform method for the study of the eigenvalue 
problems of mathematical physics. 

A linear integral equation of the third kind, the most general type
considered, has the form

$$g(x)\phi(x) = f(x) + \lambda\int_a^b K(x,z)\phi(z)dz \tag{14-1}$$

The known functions are g(x), f(x) and K(x,z), the latter being called the
kernel or nucleus. The limits of integration a and b are either known
functions of x or constants; $\lambda$ is an absolute constant or a parameter. It is
desired to find the unknown $\phi$ as a function of the independent variable x.

Four special cases of (1) have been most widely studied. In Fredholm’s
equation of the first kind, g(x) = 0, and in his equation of the second kind,
g(x) = 1; in both cases a and b are constants. Volterra’s equations of the
first and second kind are like Fredholm’s equations except that a = 0, and
b = x. If f(x) = 0 in either case, the equation is said to be homogeneous.
When one or both limits become infinite or when the kernel becomes infinite
at one or more points within the range a to b, the equation is called singular.

Non-linear integral equations may occur in the form

$$\phi(x) = f(x) + \lambda\int K(x,z)\phi^n(z)dz$$

or

$$\phi(x) = f(x) + \lambda\int F[x,z,\phi(z)]dz$$





We limit 1 our discussion here to linear equations in one variable where the
unknown $\phi$ enters only to the first power. Our plan is to present first the
purely formal mathematical methods of solution. We then show how to 
convert differential equations into integral equations and apply the theory 
to certain physical problems. 

GENERAL METHODS OF SOLVING INTEGRAL EQUATIONS 

14.2. The Liouville-Neumann Series. — a. Fredholm's Equation of the
Second Kind. Suppose the given integral equation is

$$\phi(x) = f(x) + \lambda\int_a^b K(x,z)\phi(z)dz \tag{14-2}$$

where x and z are real variables with $a \le x \le b$, $a \le z \le b$; K(x,z) and
f(x) are continuous but may be complex. We attempt to solve (2) by
means of a power series in $\lambda$:

$$\phi(x) = \sum_{n=0}^{\infty}\lambda^n\phi_n(x) \tag{14-3}$$

Substituting (3) into (2) and equating coefficients of equal powers of $\lambda$ we
obtain

$$\phi_0(x) = f(x)$$

$$\phi_1(x) = \int K(x,z)\phi_0(z)dz$$

$$\phi_2(x) = \int K(x,z)\phi_1(z)dz \tag{14-4}$$

$$\cdots$$

$$\phi_n(x) = \int K(x,z)\phi_{n-1}(z)dz$$

Remembering that both x and z are restricted to lie between a and b, we see
that the kernel and f(x) must have maximum values, for we assumed them
to be continuous. Let these maxima be given by $|K(x,z)| \le M$, $|f(x)|
\le N$. Then it follows that

$$|\phi_0| \le N, \quad |\phi_1| \le NM(b-a), \quad \cdots, \quad |\phi_n| \le N[M(b-a)]^n$$

1 References to more complete accounts of the subject will be found at the end of this
chapter. Integral equations are frequently encountered in current physical and chemi-
cal literature, indicating that they are powerful tools for handling a variety of problems.
Many examples of such usage are given by Morse, P. M., and Feshbach, H., “Methods
of Theoretical Physics,” McGraw-Hill Book Co., Inc., New York, 1953.





If

$$|\lambda| < \frac{1}{M(b-a)} \tag{14-5}$$

the series (3), which is called the Liouville-Neumann series, converges
uniformly and is the unique continuous 2 solution of (2) within the range
$a \le x \le b$.

In order to obtain the solution in more convenient form, we define the
iterated kernels: 3

$$K_1(x,z) = K(x,z)$$

$$K_2(x,z) = \int K(x,y)K(y,z)dy \tag{14-6}$$

$$\cdots$$

$$K_n(x,z) = \int K(x,y)K_{n-1}(y,z)dy = \int\cdots\int K(x,y_1)K(y_1,y_2)\cdots K(y_{n-1},z)\,dy_1dy_2\cdots dy_{n-1}$$

Introducing these functions into (4) we may write

$$\phi_1(x) = \int K(x,z)f(z)dz$$

$$\phi_2(x) = \int K_2(x,z)f(z)dz \tag{14-7}$$

$$\cdots$$

$$\phi_n(x) = \int K_n(x,z)f(z)dz$$

By the same means as before we see that $|K_n(x,z)| \le M^n(b-a)^{n-1}$;
hence if (5) is fulfilled we can construct a uniformly convergent series
called the resolvent (lösender Kern).

$$K(x,z;\lambda) = \sum_{n=0}^{\infty}\lambda^nK_{n+1}(x,z) \tag{14-8}$$

From (3), (6) and (8), it follows that the solution of the integral equation
is

$$\phi(x) = f(x) + \lambda\int K(x,z;\lambda)f(z)dz \tag{14-9}$$

2 Continuous solutions of the equation may exist even if (5) is not true. There may
also be discontinuous solutions. For these exceptions, see Lovitt, loc. cit., pp. 13 and 21.

3 Henceforth, we usually omit limits of integration unless they are different from 
a and b. 
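The successive terms of the Liouville-Neumann series can be generated numerically by replacing each integral with a quadrature sum. The sketch below is an illustration under stated assumptions: a midpoint rule on a uniform grid, the separable test kernel K(x,z) = xz on (0, 1) with f(x) = x, and the value 0.5 for the parameter, which satisfies the convergence condition since M(b - a) = 1. For this kernel the exact solution is x / (1 - lambda/3). The function name is hypothetical.

```python
def neumann_series(kernel, f, a, b, lam, terms=25, points=200):
    """Partial sum of sum(lam**n * phi_n) on midpoint nodes of [a, b]."""
    h = (b - a) / points
    z = [a + (j + 0.5) * h for j in range(points)]      # midpoint-rule nodes
    K = [[kernel(x, y) for y in z] for x in z]
    phi_n = [f(x) for x in z]                           # phi_0 = f
    total = phi_n[:]
    for n in range(1, terms):
        # phi_n(x) = integral of K(x, z) * phi_{n-1}(z) dz, by the midpoint rule.
        phi_n = [h * sum(K[i][j] * phi_n[j] for j in range(points))
                 for i in range(points)]
        total = [t + lam ** n * p for t, p in zip(total, phi_n)]
    return z, total

nodes, phi = neumann_series(lambda x, z: x * z, lambda x: x, 0.0, 1.0, 0.5)
exact = [x / (1.0 - 0.5 / 3.0) for x in nodes]
print(max(abs(p - e) for p, e in zip(phi, exact)))
```

The printed maximum deviation reflects only the quadrature error, since the series itself converges geometrically with ratio lambda/3 for this kernel.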





The resolvent and $\phi(x)$ have properties of a reciprocal nature as may
be seen by comparing (2) and (9). If $\phi(x)$ is the unknown, (9) is the
solution; if f(x) in (9) is the unknown, (2) is the solution. These properties
are even more apparent if we rewrite (8) in the form

$$K(x,z;\lambda) - K(x,z) = \lambda\sum_{n=0}^{\infty}\lambda^nK_{n+2}(x,z) = \lambda\sum_{n=0}^{\infty}\lambda^n\int K(x,y)K_{n+1}(y,z)dy$$

or

$$K(x,z;\lambda) - K(x,z) = \lambda\int K(x,y)K(y,z;\lambda)dy \tag{14-10}$$

Similarly, we may obtain

$$K(x,z;\lambda) - K(x,z) = \lambda\int K(x,y;\lambda)K(y,z)dy$$

b. Volterra’s Equation of the Second Kind. Application of the Liouville-
Neumann series may also be made in this case. Suppose

$$\phi(x) = f(x) + \lambda\int_0^x H(x,z)\phi(z)dz \tag{14-11}$$

is given. Then if

$$K(x,z) = \begin{cases} H(x,z), & z \le x \\ 0, & z > x \end{cases}$$

we may write an equation similar to (7) for z < x

$$\phi_n(x) = \int_0^x K_n(x,z)f(z)dz$$

and also an equation like (6)

$$K_n(x,z) = \int_z^x K(x,y)K_{n-1}(y,z)dy = \int_z^x K(x,y_1)dy_1\int_z^{y_1}K(y_1,y_2)K_{n-2}(y_2,z)dy_2$$

The solution of Volterra's equation obtained in this way converges for all
values of $\lambda$.


c. Volterra’s Equation of the First Kind. Under certain conditions,
Volterra’s equation of the first kind may also be solved by the Liouville-
Neumann series. With a change of notation, we write this equation as

$$g(x) = \lambda\int_0^x K(x,z)\phi(z)dz \tag{14-12}$$





Differentiation with respect to x results in

$$g'(x) = \lambda\int_0^x\frac{\partial K}{\partial x}\phi(z)dz + \lambda K(x,x)\phi(x)$$

which is similar to (11) provided $K(x,x) \ne 0$ and

$$f(x) = \frac{g'(x)}{\lambda K(x,x)}; \quad H(x,z) = -\frac{\partial K/\partial x}{\lambda K(x,x)}$$

A similar conversion of (12) to an equation of the second kind may be made
by partial integration.

When K(x,x) vanishes, the procedure just described gives an equation
of the first kind again. Let us consider the situation in more detail,
assuming that the kernel is a polynomial of n-th degree in x and that the
coefficients of the terms in x are polynomials in z, but not necessarily of
n-th degree. It is convenient to express the kernel as a polynomial in
$\xi = (x - z)$, so that it may be written

$$K(x,z) = a_0(z) + a_1(z)\xi + \cdots + a_n(z)\xi^n$$

Two special cases are of interest: (1) $a_0(z) = 0$; (2) $a_0(z)$ contains no
constant term.

In the first case, K(x,x) vanishes identically; but if the derivative of
the kernel does not vanish, which means that $a_1 \ne 0$, two differentiations
of eq. (12) will yield an equation of the second kind and a solution is again
possible by this method. Further differentiations could be carried out if
necessary. Several partial integrations could replace the differentiations,
if this were preferred.

In the second case, the kernel vanishes only for x = z = 0. However,
the integral equation may then be converted into a differential equation.
With the same polynomial kernel, differentiate eq. (12) (n + 1) times.
The integral on the right will vanish, the differential equation remaining
is of order n, and its solution, adjusted to fit the appropriate boundary
conditions, is the solution of the integral equation.

An explicit form for the solution can be given, but it is quite awkward in
the general case. Note also that the presence or absence of a constant
term in $a_0(z)$ is of no consequence. For illustrative purposes, let us take
a simpler expression for the kernel. Suppose the polynomial is only of
second degree and that $a_0(z) = A_0 + A_1z + A_2z^2$; $a_1(z) = B_0 + B_1z$;
$a_2(z) = C_0$, where $A_i$, $B_i$, $C_i$ are constants. Three differentiations of (12)
will give

$$g'''(x) = A_2x^2\phi''(x) + (B_1 + 4A_2)x\phi'(x) + (2A_2 + B_1 + 2C_0)\phi(x)$$

which is a differential equation of the Euler type. Introduction of a new





variable, u = In x, will reduce it to a linear, inhomogeneous equation with 
constant coefficients, as discussed in sec. 2.8. Its form, with proper change 
in notation, is identical with eq. (2—19), and its solution is eq. (2-22). 

An important case arises when the kernel becomes infinite at one or more
points within the range of x and z. It is then necessary to transform the
equation to remove the singularity. As a typical example, consider the
kernel

$$K(x,z) = \frac{1}{(x-z)^{\alpha}}; \quad 0 < \alpha < 1$$

which is infinite when x = z. Substitute this kernel in (12), multiply both
sides of the equation by $dx/(u-x)^{1-\alpha}$ and integrate with respect to x
from 0 to u. If for simplicity we also take $\lambda = 1$ the result is

$$\int_0^u\frac{g(x)dx}{(u-x)^{1-\alpha}} = \int_0^u\frac{dx}{(u-x)^{1-\alpha}}\int_0^x\frac{\phi(z)dz}{(x-z)^{\alpha}} = \int_0^u\phi(z)dz\int_z^u\frac{dx}{(u-x)^{1-\alpha}(x-z)^{\alpha}}$$

The justification of the change of limits and order of integration in the last
equation is the following. Since x varies from 0 to u, and for every value of
x, the variable z goes from 0 to x, the situation is equivalent to the
variation of z from 0 to u and the variation of x from z to u for every value of z.
The same result is also easily obtained from a figure. If we are integrating
F(x,z) over the shaded area of Fig. 1 we see that

$$\int_0^u dx\int_0^x F(x,z)dz = \int_0^u dz\int_z^u F(x,z)dx$$

The integral $\int_z^u F(x,z)dx = \int_z^u (u-x)^{\alpha-1}(x-z)^{-\alpha}dx$ may be
evaluated as follows. Introduce the new variable y = (u − x)/(u − z)
which shifts the limits to 0 and 1, respectively. The result is an Eulerian
integral of the first kind 4 or B-function which is simply related to the
$\Gamma$-function. Explicitly, the result using (3-12) is

$$B(\alpha, 1-\alpha) = \Gamma(\alpha)\Gamma(1-\alpha) = \pi/\sin\alpha\pi.$$

The solution of the integral equation is thus

$$\phi(u) = \frac{\sin\alpha\pi}{\pi}\frac{d}{du}\int_0^u\frac{g(x)dx}{(u-x)^{1-\alpha}}$$

Equations with singular kernel, especially those where the singularity 
results from an infinite limit of integration, may usually be solved by 
integral transforms. In fact, the transforms of Fourier, Laplace, Hankel, 


4 See sec. 3.2.





and Mellin are special cases of integral equations of the first kind. They 
have been discussed at length by Morse and Feshbach, loc. cit. 

Problem. Solve the equation $\phi(x) = x + \int_0^x (z - x)\phi(z)dz$ by the Liouville-
Neumann series. Hint: substitute (z − x) = u; (y − x) = v. Ans. $\phi(x) = \sin x$.
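The successive terms of this problem can be generated exactly, because each term is a polynomial. The sketch below is one possible check, not the hinted substitution method: it represents each term by its coefficient list and uses the elementary integral of (z - x) z^k from 0 to x, which equals -x^(k+2) / ((k+1)(k+2)). The helper names are hypothetical.

```python
from fractions import Fraction
import math

def next_term(coeffs):
    """phi_n(x) = integral from 0 to x of (z - x) * phi_{n-1}(z) dz, as coefficients."""
    out = [Fraction(0)] * (len(coeffs) + 2)
    for k, c in enumerate(coeffs):
        # integral_0^x (z - x) z^k dz = -x^(k+2) / ((k+1)(k+2))
        out[k + 2] -= c / ((k + 1) * (k + 2))
    return out

def partial_sum(n_iterations):
    """Coefficients of phi_0 + phi_1 + ... + phi_n with phi_0(x) = x."""
    term = [Fraction(0), Fraction(1)]
    total = list(term)
    for _ in range(n_iterations):
        term = next_term(term)
        total = total + [Fraction(0)] * (len(term) - len(total))
        total = [t + u for t, u in zip(total, term)]
    return total

def evaluate(coeffs, x):
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))

print(partial_sum(2))                       # x - x^3/6 + x^5/120
print(evaluate(partial_sum(8), 1.0), math.sin(1.0))
```

The coefficients are those of the Maclaurin series of sin x, in agreement with the stated answer.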



14.3. Fredholm’s Method of Solution. — a. The Inhomogeneous
Equation. Fredholm studied the solution of a system of linear equations in
n variables and observed that as n becomes infinite the results are
applicable to linear integral equations. Although the reasoning is simple, the
derivation of the final formulas requires considerable space. We therefore
show only how the method may be used, referring the reader to other
sources 5 for the intermediate steps and proofs.

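The unique and continuous solution of (2) is of the form (9), where the
resolvent is the ratio of two infinite series in $\lambda$. In fact

$$K(x,z;\lambda) = \frac{D(x,z;\lambda)}{D(\lambda)} \tag{14-13}$$

where

$$D(x,z;\lambda) = K(x,z) + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!}\,D_n(x,z)\,\lambda^n \tag{14-14}$$

$$D(\lambda) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\,D_n\,\lambda^n \tag{14-15}$$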

The coefficients $D_n$ and the functions $D_n(x,z)$ may be found from the
following recurrence relations. Starting with $K(x,z) = D_0(x,z)$ (and
$D_0 = 1$) we obtain $D_1$ from the integral

$$D_m = \int D_{m-1}(x,x)\,dx \tag{14-16}$$

We then find $D_1(x,z)$ from

$$D_m(x,z) = K(x,z)\,D_m - m\int K(x,y)\,D_{m-1}(y,z)\,dy \tag{14-17}$$

which enables us to determine $D_2$ from (16). Continuing in this way, all
of the coefficients are calculated. In many cases, depending on the explicit
form of the kernel, the series (14) and (15) contain only a finite number of
terms.

5 See references at end of the chapter.
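The recurrence (14-16), (14-17) can be tried out on a kernel for which the series terminate. The sketch below is an illustration only, with several assumptions: the kernel is $K(x,z) = x + z$ on the interval (0,1), the starting value is $D_0 = 1$, functions of two variables are tabulated on a small grid, and integrals are taken with Simpson's rule (exact here, since every integrand is a polynomial of degree two or less). The names `simpson_weights`, `fredholm_coefficients` and `fredholm_determinant` are hypothetical.

```python
import math


def simpson_weights(n):
    """Composite Simpson weights on n + 1 equally spaced points of [0, 1]; n even."""
    h = 1.0 / n
    w = [h / 3.0] * (n + 1)
    for i in range(1, n):
        w[i] = (4.0 if i % 2 else 2.0) * h / 3.0
    return w


def fredholm_coefficients(kernel, n=4, order=3):
    """Return [D_0, D_1, ..., D_order] from the recurrence (14-16), (14-17)."""
    xs = [i / n for i in range(n + 1)]
    w = simpson_weights(n)
    K = [[kernel(x, z) for z in xs] for x in xs]
    Dxz = [row[:] for row in K]            # D_0(x,z) = K(x,z)
    D = [1.0]                              # D_0 = 1 (assumed starting value)
    for m in range(1, order + 1):
        # (14-16): D_m is the integral of D_{m-1}(x,x)
        Dm = sum(w[i] * Dxz[i][i] for i in range(n + 1))
        D.append(Dm)
        # (14-17): D_m(x,z) = K(x,z) D_m - m * int K(x,y) D_{m-1}(y,z) dy
        Dxz = [[K[i][j] * Dm
                - m * sum(w[l] * K[i][l] * Dxz[l][j] for l in range(n + 1))
                for j in range(n + 1)]
               for i in range(n + 1)]
    return D


def fredholm_determinant(D, lam):
    """Evaluate the (truncated) series (14-15) for D(lambda)."""
    return sum((-1) ** k / math.factorial(k) * Dk * lam ** k
               for k, Dk in enumerate(D))


D = fredholm_coefficients(lambda x, z: x + z)
print(D)   # approximately [1, 1, -1/6, 0]
print(fredholm_determinant(D, -6.0 + 4.0 * math.sqrt(3.0)))   # close to zero
```

For this kernel the series stops after the quadratic term, $D(\lambda) = 1 - \lambda - \lambda^2/12$, whose roots are $\lambda = -6 \pm 4\sqrt{3}$.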

One distinct advantage of the Fredholm method is that (13) is uniformly
convergent for all values of $\lambda$ unless $D(\lambda) = 0$. If that happens, the
procedure which we have described is inapplicable since the denominator of
the resolvent vanishes. Actually, there is then no solution unless certain
other conditions are met. We omit the necessary extension of the Fredholm
theory but return to the problem in sec. 14.4b.

b. The Homogeneous Equation. If $f(x) = 0$, the given equation is homogeneous,

$$\phi(x) = \lambda \int K(x,z)\,\phi(z)\,dz \tag{14-18}$$

Cursory inspection of the solution (9) then leads to the conclusion that
$\phi(x) \equiv 0$. This is generally the case, but we shall see that when the
parameter $\lambda$ assumes certain special values we are led to a situation similar
to the eigenvalue and eigenfunction problem described in Chapter 8. If
$D(\lambda) = 0$ and $D(x,z;\lambda) \neq 0$, eq. (13) indicates that $K(x,z;\lambda)$
approaches infinity and we may still find non-vanishing solutions of (18).
Equating the right side of (15) to zero, we have a polynomial in $\lambda$ with n
roots, multiple or distinct. They are the eigenvalues of the kernel, and the
corresponding solutions of (18) are the eigenfunctions. Assuming that all
eigenvalues are distinct, choose one of them, say $\lambda_i$, substitute (13) in (10)
and multiply by $D(\lambda)$, which gives, since $D(\lambda_i) = 0$,

$$D(x,z;\lambda_i) = \lambda_i \int K(x,y)\,D(y,z;\lambda_i)\,dy \tag{14-19}$$

If we compare this equation with (18), we observe that $D(x,z;\lambda_i)$, for any
constant $z = c$, is a solution of the homogeneous equation, i.e.,

$$\phi_i(x) = D(x,c;\lambda_i) \tag{14-20}$$

Having found a solution for $\lambda_i$, we proceed to find the others for the
remaining eigenvalues in the same way. Linear combinations of them form
the general solution

$$\Phi(x) = \sum_{m=1}^{n} C_m\,\phi_m(x) \tag{14-21}$$

where the $C_m$ are arbitrary constants.

It is true that $D(x,z;\lambda_i)$ may vanish identically in $x$ and $z$ or vanish
because of an unfortunate choice of the constant value of $z$. In the former
case, non-trivial solutions may often be found by more complicated methods;
in the latter case, we simply choose another $z = c$. When the eigenvalues
are degenerate further modifications of the method are required.


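Problem a. Solve by the Fredholm method:

$$\phi(x) = x + \lambda \int_0^1 (x + z)\,\phi(z)\,dz$$

Ans.: $$\phi(x) = \frac{6x(\lambda - 2) - 4\lambda}{\lambda^2 + 12\lambda - 12}$$

Problem b. Show that

$$\int D(x,x;\lambda)\,dx = -\frac{dD(\lambda)}{d\lambda}$$

Hint: use eq. (16).

Problem c. Set $f(x) = 0$ in the equation of Problem a and solve. Hint: show
that $D(x,c;\lambda) = (2/\epsilon)(2 - \epsilon)(\epsilon x + 1)(\epsilon c + 1)$; $\lambda = 2\epsilon(2 - \epsilon)$; $\epsilon = \pm\sqrt{3}$.
Ans.: $\phi_{\pm}(x) = C_{\pm}(1 \pm \sqrt{3}\,x)$.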

14.4. The Schmidt-Hilbert Method of Solution. In many physical
problems, the kernel has the property of being symmetric, i.e.,
$K(x,z) = K(z,x)$. In such cases,⁶ the integral equation may be solved by a
method which is somewhat different from any of those in the preceding sections.
We find it convenient to limit the discussion to kernels which are real as
well as symmetric.

a. The Homogeneous Equation. A real symmetric kernel has at least one
eigenvalue and it may have an infinite number. We omit the proof of
these facts.

The eigenfunctions of the homogeneous equation (18) are mutually orthogonal.
Suppose $\lambda_i$ and $\lambda_j$ are two different eigenvalues corresponding
respectively to eigenfunctions $\phi_i$ and $\phi_j$. Then we may write

$$\phi_i(x) = \lambda_i \int K(x,z)\,\phi_i(z)\,dz$$
$$\phi_j(x) = \lambda_j \int K(x,z)\,\phi_j(z)\,dz$$

6 Unsymmetric kernels may often be symmetrized; see sec. 14.7 or Courant-Hilbert,
loc. cit.

Multiply the first equation by $\phi_j$ and the second by $\phi_i$, then integrate
over $x$:

$$\int \phi_i(x)\,\phi_j(x)\,dx = \lambda_i \iint K(x,z)\,\phi_i(z)\,\phi_j(x)\,dz\,dx
 = \lambda_j \iint K(x,z)\,\phi_i(x)\,\phi_j(z)\,dz\,dx \tag{14-22}$$

The last integral may be written as $\lambda_j \iint K(z,x)\,\phi_i(z)\,\phi_j(x)\,dz\,dx$ by
interchanging $x$ and $z$. Thus if $K(x,z) = K(z,x)$, the two integrals of (22) are
identical and since $\lambda_i \neq \lambda_j$, it follows that

$$\int \phi_i(x)\,\phi_j(x)\,dx = 0 \tag{14-23}$$

As we know from Chapter 8 such functions may always be normalized.
Henceforth, we will assume that this has been done and will indicate the
orthonormal solutions of (18) by $\Phi_i(x)$, so that

$$\int \Phi_i(x)\,\Phi_j(x)\,dx = \delta_{ij} \tag{14-24}$$

The eigenvalues of a real, symmetric kernel are all real. Suppose the
solution of the homogeneous equation (18) were of the form
$\phi(x) = \phi_1(x) + i\phi_2(x)$ and one of its eigenvalues were also complex,
$\lambda = \alpha + i\beta$. We could then take the complex conjugate of (18)

$$\phi^*(x) = \lambda^* \int K(x,z)\,\phi^*(z)\,dz$$

But according to (23)

$$(\lambda - \lambda^*) \int \phi(x)\,\phi^*(x)\,dx = 0$$

or

$$2i\beta \int (\phi_1^2 + \phi_2^2)\,dx = 0$$

which means that $\beta = 0$ and the eigenvalues must all be real.

Arbitrary functions of $x$, including the kernel for fixed $z$, may be
expanded in terms of the eigenfunctions

$$K(x,z) = \sum_i C_i\,\Phi_i(x) \tag{14-25}$$

The functions $\Phi_i(x)$ form a complete set as explained in Chapter 8. As
also shown there, the coefficients of (25) may be found by integrating that
equation term by term. Thus, using (24),

$$C_i = \int K(x,z)\,\Phi_i(x)\,dx$$

But

$$\Phi_i(z) = \lambda_i \int K(z,x)\,\Phi_i(x)\,dx = \lambda_i \int K(x,z)\,\Phi_i(x)\,dx$$

since the kernel is symmetric, hence

$$C_i = \frac{\Phi_i(z)}{\lambda_i}$$

and (25) becomes

$$K(x,z) = \sum_i \frac{\Phi_i(x)\,\Phi_i(z)}{\lambda_i} \tag{14-26}$$
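The bilinear expansion (14-26) can be checked numerically for a simple symmetric kernel. The sketch below is an illustration under assumptions not stated at this point of the text: the kernel is taken to be $x(1-z)$ for $x < z$ (and $z(1-x)$ for $x > z$) on (0,1), which appears as entry 1 of Table 1 in sec. 14.9, with orthonormal eigenfunctions $\sqrt{2}\sin n\pi x$ and eigenvalues $\lambda_n = n^2\pi^2$. The function names are hypothetical.

```python
import math


def kernel(x, z):
    """Symmetric kernel x(1 - z) for x <= z, z(1 - x) for x > z, on (0, 1)."""
    return x * (1.0 - z) if x <= z else z * (1.0 - x)


def bilinear_sum(x, z, n_terms=2000):
    """Partial sum of (14-26) with Phi_n = sqrt(2) sin(n pi x), lambda_n = (n pi)^2."""
    total = 0.0
    for n in range(1, n_terms + 1):
        lam_n = (n * math.pi) ** 2
        total += 2.0 * math.sin(n * math.pi * x) * math.sin(n * math.pi * z) / lam_n
    return total


x, z = 0.3, 0.7
print(kernel(x, z), bilinear_sum(x, z))   # the two values agree to about 1e-4
```

The series converges only like $1/N$ in the number of terms, so a few thousand terms give three to four correct decimals.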


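b. Solution of the Inhomogeneous Equation. We are now ready to consider
the inhomogeneous equation (2); for that purpose we assume that
we have found the eigenfunctions of the homogeneous equation by the
method of sec. 14.3b. Let them be $\Phi_i(x)$. Then we may write

$$\phi(x) - f(x) = \sum_i \alpha_i\,\Phi_i(x); \qquad
\alpha_i = \int [\phi(x) - f(x)]\,\Phi_i(x)\,dx \tag{14-27}$$

where $\phi(x)$ and $f(x)$ both come from (2). Now substitute (27) in (2) to
give

$$\sum_i \alpha_i\,\Phi_i(x) = \lambda \int K(x,z)\,f(z)\,dz
 + \lambda \sum_i \alpha_i \int K(x,z)\,\Phi_i(z)\,dz \tag{14-28}$$

We may also expand $f(x)$:

$$f(x) = \sum_i \beta_i\,\Phi_i(x); \qquad \beta_i = \int f(x)\,\Phi_i(x)\,dx \tag{14-29}$$

and obtain by using (26) and (24)

$$\int K(x,z) \sum_j \beta_j\,\Phi_j(z)\,dz
 = \int \sum_i \frac{\Phi_i(x)\,\Phi_i(z)}{\lambda_i} \sum_j \beta_j\,\Phi_j(z)\,dz
 = \sum_i \frac{\beta_i}{\lambda_i}\,\Phi_i(x)$$

with a similar expression for the last integral of (28). That equation
becomes

$$\sum_i \alpha_i\,\Phi_i(x) = \lambda \sum_i \frac{\beta_i}{\lambda_i}\,\Phi_i(x)
 + \lambda \sum_i \frac{\alpha_i}{\lambda_i}\,\Phi_i(x) \tag{14-30}$$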

Because of the independence of the functions $\Phi_i$, the coefficients of each
may be equated on both sides of this equation. Hence,

$$\alpha_i = \lambda\,\frac{\beta_i + \alpha_i}{\lambda_i}$$

or if $\lambda \neq \lambda_i$,

$$\alpha_i = \beta_i\,\frac{\lambda}{\lambda_i - \lambda} \tag{14-31}$$

This method, which was devised by Schmidt and Hilbert, thus gives a solution
for $\lambda \neq \lambda_i$, for we may substitute (29) and (31) into (27) and obtain

$$\phi(x) = f(x) + \lambda \sum_i \frac{\beta_i}{\lambda_i - \lambda}\,\Phi_i(x)
 = f(x) + \lambda \sum_i \left[\frac{\Phi_i(x)}{\lambda_i - \lambda} \int f(z)\,\Phi_i(z)\,dz\right] \tag{14-32}$$

As we have noted before, the homogeneous equation for $\lambda \neq \lambda_i$ has the
solution $\phi \equiv 0$ since $f(x) = 0$.
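Formula (14-32) can be compared with the closed-form answer of Problem a, sec. 14.3. The sketch below is an illustration under stated assumptions: the kernel is $K(x,z) = x + z$ on (0,1) with $f(x) = x$, the eigenvalues are $\lambda_\pm = 2\epsilon(2-\epsilon)$ and the eigenfunctions $1 + \epsilon x$ with $\epsilon = \pm\sqrt{3}$ (Problem c), and the normalization integral $\int_0^1 (1+\epsilon x)^2 dx = 2 + \epsilon$ has been worked out by hand. The function names are hypothetical.

```python
import math


def hilbert_schmidt_solution(x, lam):
    """Evaluate (14-32) for K(x,z) = x + z on (0,1) with f(x) = x."""
    total = x                                   # f(x)
    for eps in (math.sqrt(3.0), -math.sqrt(3.0)):
        lam_i = 2.0 * eps * (2.0 - eps)         # eigenvalue belonging to eps
        norm = math.sqrt(2.0 + eps)             # sqrt of int_0^1 (1 + eps x)^2 dx
        phi_i = (1.0 + eps * x) / norm          # normalized eigenfunction at x
        beta_i = (0.5 + eps / 3.0) / norm       # int_0^1 x * Phi_i(x) dx
        total += lam * beta_i * phi_i / (lam_i - lam)
    return total


def closed_form(x, lam):
    """Answer quoted for Problem a of sec. 14.3."""
    return (6.0 * x * (lam - 2.0) - 4.0 * lam) / (lam ** 2 + 12.0 * lam - 12.0)


for lam in (1.0, 0.5, -2.0):
    print(lam, hilbert_schmidt_solution(0.3, lam), closed_form(0.3, lam))
```

Because $f(x) = x$ lies in the span of the two eigenfunctions, the two-term sum reproduces the Fredholm result exactly for any $\lambda$ that is not an eigenvalue.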

We must still consider the exceptional case when $\lambda$ is one of the
eigenvalues of the kernel. Suppose, for example, that $\lambda = \lambda_0$ is an m-fold
degenerate eigenvalue, i.e., $\lambda_0 = \lambda_1 = \lambda_2 = \cdots = \lambda_m$. Then (2) reads

$$\phi(x) = f(x) + \lambda_0 \int K(x,z)\,\phi(z)\,dz$$

and by the preceding method we obtain

$$\alpha_i = \frac{\beta_i\,\lambda_0}{\lambda_i - \lambda_0}$$

where $i$ is not one of the numbers 1, 2, ..., m. When $i$ equals one of these
integers we have, if $\alpha_i$ is to remain finite,

$$\beta_1 = \beta_2 = \cdots = \beta_m = 0$$

which in turn requires that

$$\beta_j = \int f(x)\,\Phi_j(x)\,dx = 0; \qquad j = 1, 2, \ldots, m \tag{14-33}$$

Thus if $\lambda_0$ is an m-fold degenerate eigenvalue, the inhomogeneous equation
has solutions only if $f(x)$ is orthogonal to the corresponding eigenfunctions
$\Phi_j(x)$. The general solution of the equation is then

$$\phi(x) = f(x) + \lambda_0 {\sum_i}' \left[\frac{\Phi_i(x)}{\lambda_i - \lambda_0} \int f(z)\,\Phi_i(z)\,dz\right]
 + C_1\Phi_1(x) + \cdots + C_m\Phi_m(x) \tag{14-34}$$

where the prime on the summation sign means that the terms $i = 1, 2, \ldots, m$
are to be omitted from the sum.

Problem. Find the solution of the equation of Problem a, sec. 14.3, by the Hilbert- 
Schmidt method for X not equal to an eigenvalue. Show that there are no solutions 
when X is an eigenvalue. 





14.5. Summary of Methods of Solution. a. The Homogeneous Equation.

1. $D(\lambda) \neq 0$. No solution except $\phi(x) \equiv 0$.

2. $D(\lambda) = 0$; $D(x,z;\lambda) \neq 0$. Solution is given by (20) and (21).

The resulting eigenfunctions are orthogonal and may be normalized. To
each solution belongs an eigenvalue.

b. The Non-homogeneous Equation.

1. Solution given by (9) provided (5) holds.

2. For all values of $\lambda \neq \lambda_i$ solution is given by (9) and (13).

3. If $K(x,z) = K(z,x)$ and $\lambda \neq \lambda_i$, solution is (32).

4. If $K(x,z) = K(z,x)$ and $\lambda = \lambda_i$, solution is (34).

Special methods have been given for Volterra's equations of the first and
second kinds.


USE OF INTEGRAL EQUATIONS 


14.6. Relation between Differential and Integral Equations. We have
shown in the previous sections how integral equations of the more common
types may be solved. We now propose to study the relation between
differential and integral equations so that we may state physical problems
in either form at will. For this purpose consider as a simple example the
second order differential equation

$$y'' = f\{x, y(x)\} \tag{14-35}$$

Integration results in

$$y'(x) = \int_0^x f\{z, y(z)\}\,dz + C_1$$
$$y(x) = \int_0^x \left[\int_0^x f\{z, y(z)\}\,dz\right] dx + C_1 x + C_2 \tag{14-36}$$

An alternative form of the last expression⁷ is

$$y(x) = \int_0^x (x - z)\,f\{z, y(z)\}\,dz + g(x); \qquad g(x) = C_1 x + C_2 \tag{14-37}$$

which is recognized as a non-linear Volterra equation of the second kind
with $y(x)$ as the unknown.

7 To show that the two equations for y(x) are identical, differentiate the last one
with respect to x; the result is (36).
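The equivalence of the repeated integral in (36) and the single integral in (37) can also be checked numerically. The sketch below is an illustration only: it assumes a right-hand side that depends on $z$ alone, $f = \cos z$, for which both expressions equal $1 - \cos x$, and it uses a simple midpoint rule. The helper names `midpoint`, `repeated_integral` and `single_integral` are hypothetical.

```python
import math


def midpoint(func, a, b, n=400):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def repeated_integral(f, x):
    """int_0^x [ int_0^t f(z) dz ] dt, the double integral of (14-36)."""
    return midpoint(lambda t: midpoint(f, 0.0, t), 0.0, x)


def single_integral(f, x):
    """int_0^x (x - z) f(z) dz, the kernel form of (14-37)."""
    return midpoint(lambda z: (x - z) * f(z), 0.0, x)


x = 1.2
print(repeated_integral(math.cos, x),
      single_integral(math.cos, x),
      1.0 - math.cos(x))
```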





The boundary conditions which are needed to determine the two
integration constants, $C_1$ and $C_2$, may be either of two types: (a) $y$ and $y'$
are fixed at one point within the range of integration, say at $x = 0$; (b) $y$ is
fixed at two points. The first case is simple, for if $y(0) = a$, $y'(0) = b$,
(37) becomes

$$y(x) = \int_0^x (x - z)\,f\{z, y(z)\}\,dz + bx + a$$


The second case leads to greater difficulties. Suppose $y(0) = a$, $y(1) = b$;
then $C_2 = a$, as before. For $x = 1$, we have

$$b = y(1) = \int_0^1 (1 - z)\,f\,dz + C_1 + a$$

or

$$C_1 = (b - a) - \int_0^1 (1 - z)\,f\,dz$$

where we abbreviate $f\{z, y(z)\}$ by the single symbol $f$. Substituting the
values of $C_1$ and $C_2$ into (37) we obtain

$$y(x) = h(x) + \int_0^x (x - z)\,f\,dz + x\int_0^1 (z - 1)\,f\,dz$$
$$\quad = h(x) + \int_0^x (x - z)\,f\,dz + x\int_0^x (z - 1)\,f\,dz + x\int_x^1 (z - 1)\,f\,dz$$
$$\quad = h(x) + \int_0^x z(x - 1)\,f\,dz + \int_x^1 x(z - 1)\,f\,dz \tag{14-38}$$

where $h(x) = a + (b - a)x$. We thus see that in this case, if we are
willing to divide the range of $x$ into two parts with a different kernel for
each part,

$$K(x,z) = z(x - 1), \quad x > z$$
$$K(x,z) = x(z - 1), \quad x < z$$

eq. (38) becomes an integral equation of the Fredholm type

$$y(x) = h(x) + \int_0^1 K(x,z)\,f\{z, y(z)\}\,dz$$
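The two-part kernel can be tested on a case where the answer is known. The sketch below is an illustration under assumptions of its own: the right-hand side is taken to be independent of $y$, namely $y'' = 6x$ with $y(0) = 0$, $y(1) = 1$, whose solution is $y = x^3$, and the integral over $z$ is evaluated with a midpoint rule. The function names are hypothetical.

```python
def kernel(x, z):
    """K(x,z) = z(x - 1) for x > z and x(z - 1) for x < z."""
    return z * (x - 1.0) if z < x else x * (z - 1.0)


def solve_two_point(f, a, b, x, n=2000):
    """y(x) = h(x) + int_0^1 K(x,z) f(z) dz with h(x) = a + (b - a) x."""
    h_step = 1.0 / n
    integral = h_step * sum(kernel(x, (i + 0.5) * h_step) * f((i + 0.5) * h_step)
                            for i in range(n))
    return a + (b - a) * x + integral


# y'' = 6x, y(0) = 0, y(1) = 1  has the solution y = x^3
for x in (0.25, 0.5, 0.75):
    print(x, solve_two_point(lambda z: 6.0 * z, 0.0, 1.0, x), x ** 3)
```

When $f$ depends on $y$ as well, the same kernel would appear inside an iteration or a linear solve rather than a single quadrature.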

Problem. Convert the following differential equation and its boundary conditions
to an integral equation.

$$y'' + y = 0; \qquad y(0) = y''(0) = 0; \qquad y'(0) = 1$$

Ans.: $y(x) = x + \int_0^x (z - x)\,y(z)\,dz$





14.7. Green's Function. Our problem now is to find a general method
of constructing such kernels. For this purpose we consider the inhomogeneous
Sturm-Liouville equation

$$L(u) = (pu')' - qu = -\phi(x) \tag{14-39}$$

the homogeneous form of which has been discussed in Chapter 8. We
will later prove that a certain function $G(x,z)$ called Green's function is the
kernel of a homogeneous integral equation which is equivalent to (39)
and its boundary conditions. At the moment we study the means of
finding Green's function. For reasons which will presently be clear, it is
defined to have the following properties:

a. For fixed $z$, it is a continuous function of $x$ and satisfies all of the
boundary conditions to be imposed on $u$.

b. Both $G'$ and $G''$ are continuous at every point within the range
of $x$ except at $x = z$, where $G'$ is discontinuous⁸ so that

$$G'(z + 0) - G'(z - 0) = -1/p(z) \tag{14-40}$$

c. Except at $x = z$, $G(x,z)$ satisfies the differential equation $L(G) = 0$.

We now proceed to find such a function $G$. Suppose two linearly independent
solutions of

$$L(u) = 0 \tag{14-41}$$

are known. If these are $u_1(x)$ and $u_2(x)$ their independence may be
recognized by the fact that the Wronskian, $u_1 u_2' - u_1' u_2 \neq 0$ (see sec.
3.13), and the general solution of (41) is

$$u(x) = C_1 u_1 + C_2 u_2$$

Let us divide the range of $x$ into two portions: $a < x < z$, $z < x < b$,
and write

$$u_I = (A - \alpha)\,u_1(x) + (B - \beta)\,u_2(x); \quad x < z$$
$$u_{II} = (A + \alpha)\,u_1(x) + (B + \beta)\,u_2(x); \quad x > z \tag{14-42}$$

where $A$, $\alpha$, $B$, $\beta$ are constants to be so chosen that $u$, which will later be
taken as our Green function, satisfies conditions a, b and c. If we impose
on this function the requirements a and b, we must have

$$u_I(z) = u_{II}(z)$$
$$u_{II}'(z) - u_I'(z) = -1/p(z) \tag{14-43}$$

8 The notation $G'(z + 0)$ means that $G'$ is evaluated at the discontinuity when it
is approached from values of $x > z$ while $G'(z - 0)$ is evaluated when the discontinuity
is approached in the opposite direction. It is necessary to make this distinction in
order that the magnitude of the discontinuity will be determined with respect to sign.



or, because of (42)

$$\alpha u_1(z) + \beta u_2(z) = 0$$
$$\alpha u_1'(z) + \beta u_2'(z) = -1/2p(z)$$

Solving these equations for $\alpha$ and $\beta$, we obtain

$$\alpha = \frac{1}{2p}\,\frac{u_2}{u_1 u_2' - u_1' u_2}; \qquad
\beta = -\frac{1}{2p}\,\frac{u_1}{u_1 u_2' - u_1' u_2}$$

and hence

$$u(x) = f(x,z) + A u_1(x) + B u_2(x) \tag{14-44}$$

where

$$f(x,z) = \pm\frac{1}{2p(z)} \left[\frac{u_1(z)\,u_2(x) - u_2(z)\,u_1(x)}{u_1(z)\,u_2'(z) - u_1'(z)\,u_2(z)}\right]$$

Here and in the remainder of this chapter, when two equations are given
or when there is a choice of sign, the first always refers to $x < z$ and the
second to $x > z$. The two constants $A$ and $B$ of (44) are determined so
that $u(x)$ satisfies the boundary conditions of the problem. The resulting
function, which we henceforth indicate by $G(x,z)$, is Green's function.
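The construction of (14-44) lends itself to a short numerical sketch. The code below is an illustration only, and it assumes the simplest type of boundary condition, $u(a) = u(b) = 0$, so that $A$ and $B$ follow from a 2 x 2 linear system. The function `make_green` and its arguments (the two solutions, their derivatives, and $p$) are hypothetical names introduced here.

```python
import math


def make_green(u1, du1, u2, du2, p, a, b):
    """Green's function of (14-44) for the boundary conditions u(a) = u(b) = 0."""

    def f_part(x, z):
        # +-(1/2p(z)) [u1(z)u2(x) - u2(z)u1(x)] / W(z); upper sign for x < z
        wronskian = u1(z) * du2(z) - du1(z) * u2(z)
        val = (u1(z) * u2(x) - u2(z) * u1(x)) / (2.0 * p(z) * wronskian)
        return val if x < z else -val

    def green(x, z):
        # Choose A, B so that f + A u1 + B u2 vanishes at x = a and x = b
        r1, r2 = -f_part(a, z), -f_part(b, z)
        det = u1(a) * u2(b) - u1(b) * u2(a)
        A = (r1 * u2(b) - r2 * u2(a)) / det
        B = (u1(a) * r2 - u1(b) * r1) / det
        return f_part(x, z) + A * u1(x) + B * u2(x)

    return green


# L(u) = u'' with u1 = x, u2 = 1: expected result x(1 - z) for x < z
G0 = make_green(lambda x: x, lambda x: 1.0, lambda x: 1.0, lambda x: 0.0,
                lambda x: 1.0, 0.0, 1.0)
print(G0(0.3, 0.7), 0.3 * (1.0 - 0.7))

# L(u) = u'' + k^2 u with u1 = cos kx, u2 = sin kx
k = 2.0
Gk = make_green(lambda x: math.cos(k * x), lambda x: -k * math.sin(k * x),
                lambda x: math.sin(k * x), lambda x: k * math.cos(k * x),
                lambda x: 1.0, 0.0, 1.0)
print(Gk(0.3, 0.7), math.sin(k * 0.3) * math.sin(k * (1 - 0.7)) / (k * math.sin(k)))
```

Other boundary conditions (periodic, mixed) would change only the two linear equations that fix $A$ and $B$.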

We now prove that if $\phi(x)$ is a continuous function of $x$, then the function
which will satisfy the differential equation (39) is given by

$$u(x) = \int_a^b G(x,z)\,\phi(z)\,dz \tag{14-45}$$

Differentiation of (45) with respect to $x$ gives

$$u'(x) = \int_a^b G'(x,z)\,\phi(z)\,dz$$

$$u''(x) = \int_a^x G''(x,z)\,\phi(z)\,dz + \int_x^b G''(x,z)\,\phi(z)\,dz
 + G'(x, x - 0)\,\phi(x) - G'(x, x + 0)\,\phi(x)$$
$$\quad = \int_a^b G''(x,z)\,\phi(z)\,dz + [G'(x + 0, x) - G'(x - 0, x)]\,\phi(x)$$

Therefore

$$pu'' + p'u' - qu = L(u) = \int_a^b (pG'' + p'G' - qG)\,\phi(z)\,dz - \phi(x)$$

Requirement c causes the first term on the right to vanish. Hence, we
have established (39) and completed the proof that $G(x,z)$, calculated as
described, is the kernel of (45), and that the latter is equivalent to (39)
and its boundary conditions.

An important consequence of the properties of Green's function is
that it is symmetric. The proof proceeds as follows: Let us integrate
the identity

$$vL(u) - uL(v) = \frac{d}{dx}\,[p(vu' - uv')]$$

This results in a relation known as Green's formula:

$$I_a^b = \int_a^b [vL(u) - uL(v)]\,dx = [p(vu' - uv')]_a^b = S_a^b \tag{14-46}$$

Now let $G(x,z_1) = v$; $G(x,z_2) = u$; and consider the three ranges
$a < x < z_1$; $z_1 < x < z_2$; $z_2 < x < b$. Evaluate the integral, dividing it into
three parts $a, z_1 - \delta$; $z_1 + \delta, z_2 - \delta$; $z_2 + \delta, b$, where $\delta$ is a small
increment which will approach zero in the limit.

We thus may write

$$I_a^b = S_a^{z_1 - \delta} + S_{z_1 + \delta}^{z_2 - \delta} + S_{z_2 + \delta}^{b}
 = S_a^b - S_{z_1 - \delta}^{z_1 + \delta} - S_{z_2 - \delta}^{z_2 + \delta} \tag{14-47}$$

According to (46) $I_a^b = S_a^b$ and both must be zero, because from c,
$L(u) = L(v) = 0$. This in turn requires that $I_a^b = 0$ and $S_a^b = 0$ since
otherwise Green's function will not satisfy the boundary conditions. If in (47)
we let $\delta \to 0$ and use (46) we obtain

$$0 = -p(z_1)\{[v(z_1)u'(z_1) - v'(z_1 + 0)u(z_1)] - [v(z_1)u'(z_1) - v'(z_1 - 0)u(z_1)]\}$$
$$\quad - p(z_2)\{[v(z_2)u'(z_2 + 0) - v'(z_2)u(z_2)] - [v(z_2)u'(z_2 - 0) - v'(z_2)u(z_2)]\}$$

In writing these equations it must be remembered that $u$ and $v$ are continuous
for the whole range while $u'$ is discontinuous only at $z_2$ and $v'$ only at $z_1$,
so that for example $u'(z_1 + 0) = u'(z_1)$. Finally from (40) we obtain

$$u(z_1) = v(z_2)$$

or

$$G(z_1, z_2) = G(z_2, z_1)$$

Since the points $z_1$ and $z_2$ are arbitrary we write in general

$$G(x,z) = G(z,x)$$

The symmetry of Green's function is of considerable importance, since it
permits application of the Hilbert-Schmidt theory.

It frequently happens that the two constants $A$ and $B$ of (44) cannot
be adjusted to satisfy the given boundary conditions. In this case, a
modified Green function⁹ can be found in the following way. Suppose $u_0(x)$
is a solution of (41) that satisfies both boundary conditions. Then $cu_0(x)$
will also satisfy the conditions. No loss of generality occurs if we determine
the constant so that $u_0(x)$ is normalized,

$$\int u_0^2(x)\,dx = 1$$

and we shall suppose that this is done. We now set

$$L(u) = u_0(x)\,u_0(z)$$

and determine a function $G(x,z)$ that has the same properties as we required
of the simple Green function, except that it satisfies the equation
$L(G) = u_0(x)\,u_0(z)$ instead of $L(G) = 0$. We finally require that

$$\int G(x,z)\,u_0(x)\,dx = 0 \tag{14-48}$$

The resulting modified Green function, which is symmetric, satisfies the
inhomogeneous differential equation (39) including its boundary conditions.
The proof of these facts is similar to that used in the case of the simple
Green function.

Problem. Find Green's function for $L(u) = u''$ with $u(0) = u(1) = 0$. Hint:
let $u_1(x) = x$; $u_2(x) = 1$.

Ans.: See Table 1, sec. 14.9.

Example. Suppose $L(u) = u'' = 0$; $u(1) = u(-1)$; $u'(1) = u'(-1)$.
If we substitute the two linearly independent solutions of the preceding
problem in (44) we see that $dG(x,z)/dx = \pm\tfrac{1}{2} + A$, hence the second
boundary condition cannot be satisfied. A solution of the differential
equation which does satisfy the boundary conditions is $u_0$ = constant or
when normalized $u_0(x) = 1/\sqrt{2}$. Hence we seek a solution of the equation
$L(u) = u'' = u_0(x)\,u_0(z) = \tfrac{1}{2}$. This is $u = x^2/4$. Using (44) and
the results of the last problem we see that

$$G(x,z) = \pm\frac{x - z}{2} + Ax + B + \frac{x^2}{4}$$

which gives $A = -z/2$ when the further condition $G(1,z) = G(-1,z)$ is
imposed. Omitting the constant factor $u_0(x) = 1/\sqrt{2}$ we now determine
$B$ so that (48) is satisfied. This requires that

$$\int_{-1}^{1} \left[\pm\frac{x - z}{2} - \frac{zx}{2} + B + \frac{x^2}{4}\right] dx = 0$$

The result is $B = \tfrac{1}{6} + z^2/4$, so that finally

$$G(x,z) = \pm\frac{x - z}{2} + \frac{(x - z)^2}{4} + \frac{1}{6}$$

This will satisfy all of the boundary conditions.

9 A different procedure is possible in some cases; see Lovitt, loc. cit.
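The properties claimed for the modified Green function of this example can be verified numerically. The sketch below is an illustration only; it writes the result as $G(x,z) = -|x-z|/2 + (x-z)^2/4 + 1/6$ on $-1 \le x, z \le 1$ (the $\pm$ convention combined into an absolute value) and checks periodicity, the condition (14-48), and $G'' = 1/2$ away from $x = z$. The names used are hypothetical.

```python
def modified_green(x, z):
    """G(x,z) = -|x - z|/2 + (x - z)^2/4 + 1/6 for -1 <= x, z <= 1."""
    return -abs(x - z) / 2.0 + (x - z) ** 2 / 4.0 + 1.0 / 6.0


def integral_over_x(z, n=2000):
    """Midpoint approximation of int_{-1}^{1} G(x,z) dx, cf. (14-48)."""
    h = 2.0 / n
    return h * sum(modified_green(-1.0 + (i + 0.5) * h, z) for i in range(n))


z = 0.3
# Periodic boundary condition G(1,z) = G(-1,z)
periodic_gap = modified_green(1.0, z) - modified_green(-1.0, z)

# Second derivative by central differences at a point x != z; should equal 1/2
x0, h = 0.8, 1e-3
second_derivative = (modified_green(x0 + h, z) - 2.0 * modified_green(x0, z)
                     + modified_green(x0 - h, z)) / h ** 2

print(periodic_gap, integral_over_x(z), second_derivative)
```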

14.8. The Inhomogeneous Sturm-Liouville Equation. Having proved
that we can convert (39) to an integral equation, we wish to give explicit
forms of the latter for different $\phi(x)$. Suppose

$$\phi(x) = \lambda w u - \chi(x)$$

so that (39) becomes

$$L(u) + \lambda w u = \chi(x) \tag{14-49}$$

The resulting integral equation is

$$u(x) = \lambda \int G(x,z)\,w(z)\,u(z)\,dz + g(x); \qquad
g(x) = -\int G(x,z)\,\chi(z)\,dz \tag{14-49a}$$

which is equivalent to (49) and its boundary conditions. Finally if
$\chi(x) = 0$, the homogeneous differential equation

$$L(u) + \lambda w u = 0 \tag{14-49b}$$

and its boundary conditions become equivalent to

$$u(x) = \lambda \int G(x,z)\,w(z)\,u(z)\,dz \tag{14-50}$$

but the kernel in this case is not symmetric unless $w(x) = 1$. If that is
true (50) is a homogeneous integral equation and can be solved by the
methods of sec. 14.3b. If $w(x) \neq 1$, we may introduce a new unknown
function

$$y(x) = u(x)\sqrt{w(x)}$$

multiply the integral equation by $\sqrt{w(x)}$ and obtain

$$y(x) = \lambda \int H(x,z)\,y(z)\,dz$$

where we now have a symmetric kernel $H(x,z) = G(x,z)\sqrt{w(x)\,w(z)}$.
Eq. (49b) forms the basis of the Sturm-Liouville theory which was discussed
in sec. 8.5.

Let us consider (41) and (49b) further. We write

$$L(v) + \lambda v = 0; \qquad L(u) = 0 \tag{14-51}$$

and suppose that their Green functions are known so that

$$u = G(x,\eta); \qquad v = \Gamma(x,\xi) \tag{14-52}$$

Substitute these relations in Green's formula (46), use (40) and arguments
similar to those which proved that Green's function is symmetric. The
result is

$$\Gamma(\eta,\xi) = G(\xi,\eta) + \lambda \int G(x,\eta)\,\Gamma(x,\xi)\,dx$$

For fixed $\xi$, this is recognized as identical with (2) where $\Gamma$ is the unknown,
$G(\xi,\eta) = f(\eta)$ and $G(x,\eta)$ is the kernel. If we now change $x$, $\eta$, $\xi$ to
$z_1$, $x$, $z$ and remember that the kernel is symmetric we obtain

$$\Gamma(x,z;\lambda) - G(x,z) = \lambda \int G(x,z_1)\,\Gamma(z_1,z;\lambda)\,dz_1$$

which shows by comparison with (10) that $\Gamma(x,z;\lambda)$ is the resolvent of the
kernel $G(x,z)$. We may thus use equations of the form of (2) or (10) to
find the solution of either form of (51) when the appropriate Green function
(52) is known. Finally, referring to (17) and the result of Problem b,
sec. 14.3, we see that

$$\frac{D'(\lambda)}{D(\lambda)} = \frac{d \ln D(\lambda)}{d\lambda} = -\int \Gamma(x,x;\lambda)\,dx$$

which will give $D(\lambda)$ by integration over $\lambda$ and hence the eigenvalues from
the relation $D(\lambda) = 0$.
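The statement that $\Gamma$ is the resolvent of the kernel $G$ can be checked numerically in a simple case. The sketch below is an illustration under assumptions drawn from Table 1 of sec. 14.9: $G$ is taken as entry 1 ($L(u) = u''$, $u(0) = u(1) = 0$) and $\Gamma$ as entry 7 ($u'' + \lambda u$ with $\lambda = k^2$), and the integral over $z_1$ is approximated by a midpoint rule. The function names are hypothetical.

```python
import math


def green(x, z):
    """Entry 1 of Table 1: x(1 - z) for x <= z, z(1 - x) for x > z."""
    return x * (1.0 - z) if x <= z else z * (1.0 - x)


def resolvent(x, z, k):
    """Entry 7 of Table 1: sin(k x) sin(k(1 - z)) / (k sin k) for x <= z."""
    lo, hi = min(x, z), max(x, z)
    return math.sin(k * lo) * math.sin(k * (1.0 - hi)) / (k * math.sin(k))


def right_side(x, z, k, n=4000):
    """lambda * int_0^1 G(x, z1) Gamma(z1, z; lambda) dz1 with lambda = k^2."""
    h = 1.0 / n
    total = sum(green(x, (i + 0.5) * h) * resolvent((i + 0.5) * h, z, k)
                for i in range(n))
    return k * k * h * total


k, x, z = 2.0, 0.3, 0.6          # lambda = 4 is not an eigenvalue (pi^2, 4 pi^2, ...)
left = resolvent(x, z, k) - green(x, z)
print(left, right_side(x, z, k))
```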


Problem. Find Green's function for $L(u) = u'' + k^2 u$ with the boundary
conditions of the previous problem. Hint: take $u_1(x) = \cos kx$; $u_2(x) = \sin kx$.

Ans.: See Table 1.


14.9. Some Examples of Green’s Function. — For convenience of 
reference, we list in Table 1 Green’s function for some important differen- 
tial equations. The following boundary conditions include those most 
often encountered: 


a. $u(0) = u(1) = 0$
b. $u(-1) = u(1)$; $u'(-1) = u'(1)$
c. $u(0) = u'(1) = 0$
d. $u(-1) = u(1) = 0$
e. $u(0) = -u(1)$; $u'(0) = -u'(1)$
f. $u(0) = u(1) = u'(0) = u'(1)$
g. $u(x)$ finite; $-\infty < x < \infty$



When the limits are $a$ and $b$, the appropriate Green function $G(X,Z)$ may be
found from our results by the transformations

$$x = \frac{X - a}{b - a}; \qquad z = \frac{Z - a}{b - a} \tag{14-53}$$

for if $G(x,z)$ is bounded by (0,1) then $G(X,Z)$ is bounded by $(a,b)$. The
method of calculating Green's function in each case is identical with that
described in the preceding sections. When only one equation is given for
$G(x,z)$ it refers to $x < z$; for $x > z$, interchange $x$ and $z$.

In addition to the results found in Table 1, Green's function for several
other differential equations will be given (see also Table 1 in sec. 8.5).

For the Legendre differential equation

$$L(u) = [(1 - x^2)u']'; \qquad -1 < x < 1$$

The boundary conditions are that the solutions remain finite at $x = \pm 1$.
Green's function is

$$G(x,z) = -\tfrac{1}{2} \ln[(1 - x)(1 + z)] + \ln 2 - \tfrac{1}{2} \tag{14-54}$$

The associated Legendre differential equation is

$$[(1 - x^2)u']' - \frac{m^2 u}{1 - x^2} = 0$$

and

$$G(x,z) = \frac{1}{2m}\left[\frac{(1 + x)(1 - z)}{(1 - x)(1 + z)}\right]^{m/2}; \qquad m \neq 0 \tag{14-55}$$

For $m = 0$, the proper Green function is (54).

The zero-th order Bessel equation is $L(u) = (xu')' = 0$. With the
boundary conditions $u(1) = 0$; $u(0)$ finite

$$G(x,z) = -\ln z \tag{14-56}$$

The n-th order equation is

$$(xu')' - \frac{n^2}{x}\,u = 0$$

and

$$G(x,z) = \frac{1}{2n}\left[\left(\frac{x}{z}\right)^n - (xz)^n\right]$$

with the same boundary conditions as for the zero-th order equation.


TABLE 1

No.   L(u)               Boundary    G(x,z)
                         Condition
1.    $u''$              (a)         $(1 - z)x$
2.    $u''$              (b)         $\tfrac{1}{4}(x - z)^2 + \tfrac{1}{6} - \tfrac{1}{2}|x - z|$
3.    $u''$              (c)         $x$
4.    $u''$              (d)         $-\tfrac{1}{2}\{|x - z| + xz - 1\}$
5.    $u''$              (e)         $-\tfrac{1}{2}|x - z| + \tfrac{1}{4}$
6.    $u''$              (g)         none exists
7.    $u'' + \lambda u$  (a)         $\dfrac{\sin kx\,\sin k(1 - z)}{k \sin k}$;  $k^2 = \lambda > 0$
8.    $u'' + \lambda u$  (b)         $-\dfrac{\cos k(x - z + 1)}{2k \sin k}$
9.    $u'' - \lambda u$  (a)         $\dfrac{\sinh kx\,\sinh k(1 - z)}{k \sinh k}$
10.   $u'' - \lambda u$  (b)         $\dfrac{\cosh k(x - z + 1)}{2k \sinh k}$
11.   $u'' - \lambda u$  (c)         $\dfrac{\cosh kx\,\cosh k(1 - z)}{k \sinh k}$
12.   $u'' - u$          (g)         $\tfrac{1}{2}e^{-|x - z|}$
13.   $u^{IV}$           (f)         (...)/6


APPLICATION TO PHYSICAL PROBLEMS 

14.10. Abel's Integral Equation. One of the earliest applications of
integral equations to a physical problem was made by Abel (1823). Consider
a particle which falls along a smooth curve in a vertical plane. Let
its original position above a given horizontal plane be $z_0$, its position at
time $t$ be $z$ and at the end of its fall be $z = 0$. Let $ds$ be the distance
travelled in time $dt$. Then if the particle moves under no force but $mg$, the
force of gravity, its velocity is

$$v = \frac{ds}{dt} = \sqrt{2g(z_0 - z)} \tag{14-57}$$

The whole time of descent is

$$T = \int_0^T dt = \int \frac{ds}{\sqrt{2g(z_0 - z)}}
 = \frac{1}{\sqrt{2g}} \int_0^{z_0} \frac{s'(z)\,dz}{\sqrt{z_0 - z}}$$

If the shape of the curve is given in terms of $z$,

$$s = s(z)$$

then the time of descent may be calculated. The reverse problem studied
by Abel is to find a curve for which the time $T$ is a given function of $z_0$,
$T(z_0) = f(z_0)$ (compare the brachistochrone problem, sec. 6.1b). We
thus wish to find

$$\phi(z) = \frac{s'(z)}{\sqrt{2g}}$$

or

$$f(z_0) = \int_0^{z_0} \frac{\phi(z)\,dz}{\sqrt{z_0 - z}} \tag{14-58}$$

which is a Volterra integral equation of the first kind. The presence of the
singularity at $z = z_0$ makes it necessary to solve the equation in the manner
of sec. 14.2c. The details may be left to the reader.

14.11. Vibration Problems. a. The homogeneous string treated in
Chapter 7 was reduced to the eigenvalue problem (cf. eq. 7-33),

$$S''(x) + k^2 S(x) = 0$$

If we make the proper change of variable so that the boundary conditions
are $S(0) = S(1) = 0$ we see that the differential equation is similar to
(49b), the boundary conditions lead to Green's function (1) from Table 1
and the resulting homogeneous integral equation is of the form of eq. (50)
when $\lambda = k^2$ and $w = 1$.

b. Forced Vibrations. Suppose the string is subjected to a periodic
force $f(x)\cos(\beta t + \delta)$. Then if we set $v = 1$ in eq. (1) of Chapter 8 we
have

$$\frac{\partial^2 U}{\partial t^2} = \frac{\partial^2 U}{\partial x^2} + f(x)\cos(\beta t + \delta) \tag{14-59}$$

with boundary conditions $U(0,t) = U(1,t) = 0$. We seek a solution of
the form

$$U = S(x)\cos(\beta t + \delta)$$

which reduces (59) to

$$S''(x) + \beta^2 S(x) = -f(x) \tag{14-60}$$

if we remember that $S(0) = S(1) = 0$. This differential equation is like
(49) and the integral equation like (49a) with kernel identical with that
of the homogeneous string. When $\beta^2$ is an eigenvalue, the integral equation
may be solved provided $f(x)$ is orthogonal to the eigenfunctions of the
homogeneous equation. We know from Chapter 8 that the latter are
$\sin n\pi x$, hence the required condition is

$$\int_0^1 f(x)\,\sin n\pi x\,dx = 0$$

If $\beta^2$ is not an eigenvalue, solutions are still possible. Following the
procedure of sec. 14.8, we look for Green's function of eq. (60) which is given
as item (7) in Table 1. This is the resolvent of our integral equation,
hence from eq. (10) the unique solution of (60) is

$$S(x) = g(x) + \beta^2 \int_0^1 \Gamma(x,z)\,g(z)\,dz$$

$$g(x) = \int_0^1 G(x,z)\,f(z)\,dz$$

c. The Suspended Rope. Let a rope of unit length hang in its equilibrium
position from the point $x = 1$. If it executes small vibrations in
a vertical plane, its equation of motion is

$$\frac{\partial^2 U}{\partial t^2} = \frac{\partial}{\partial x}\left(x\,\frac{\partial U}{\partial x}\right)$$

with $U$ as its displacement. The horizontal component of its tension at $x$
is $x(\partial U/\partial x)$, so the boundary conditions are $U(1) = 0$, $U(0)$ finite.
Writing $U = u(x)\,\phi(t)$ we obtain

$$[xu'(x)]' + k^2 u(x) = 0$$
$$\phi''(t) + k^2 \phi(t) = 0$$

The proper Green function for the homogeneous differential equation in $x$
is eq. (56).

REFERENCES ON INTEGRAL EQUATIONS

Chapters on the subject may be found in:

Courant, R., and Hilbert, D., “Methods of Mathematical Physics,” Vol. I, First English
Edition, Revised, Interscience Publishers, Inc., New York, 1953.

Horn, J., “Partielle Differentialgleichungen,” de Gruyter, Berlin, 1929.

Kowalewski, G., “Determinantentheorie einschliesslich der Fredholmschen Determinanten,”
Third Edition, Chelsea Publishing Co., New York, 1942.

Morse, P. M., and Feshbach, H., “Methods of Theoretical Physics,” 2 vols., McGraw-Hill
Book Co., Inc., New York, 1953.

Murnaghan, F. D., “Introduction to Applied Mathematics,” John Wiley and Sons, Inc.,
New York, 1948.

Whittaker, E. T., and Watson, G. N., “Modern Analysis,” Fourth Edition, Cambridge
University Press, 1927.

More extended treatments are:

Bôcher, M., “Introduction to Integral Equations,” Cambridge Mathematical Tracts,
No. 10, 1909.

Hamel, G., “Integralgleichungen,” J. Springer, Berlin, 1937; Edwards Brothers, Ann
Arbor.

Hellinger, E., and Toeplitz, O., “Integralgleichungen und Gleichungen mit Unendlichen
Unbekannten,” Chelsea Publishing Co., New York, 1928.

Kneser, A., “Integralgleichungen und ihre Anwendung in der Mathematischen Physik,”
Second Edition, Vieweg, Brunswick, 1922.

Kowalewski, G., “Integralgleichungen,” de Gruyter, Berlin, 1930.

Lovitt, W. V., “Linear Integral Equations,” McGraw-Hill Book Co., Inc., New York,
1924; reprinted by Dover Publishing Co., New York, 1950.

Muskhelishvili, N. I., translated by J. R. M. Radok, “Singular Integral Equations,”
Groningen, 1953.

Vivanti-Schwank, “Lineare Integralgleichungen,” Helwingsche Verlagsbuchhandlung,
Hannover, 1929.



CHAPTER 15 

GROUP THEORY 

PROPERTIES OF A GROUP 

Group theory has become so vital a part of modern physical and 
chemical analysis that the inclusion of its basic structure seemed inevitable 
to the authors of this book. Because of the great volume of available 
material arbitrary selection had to be made, and many proofs had to be 
omitted or given only in outline. Care has been taken, however, to insure 
that the attentive reader of the present chapter will be able to familiarize 
himself with all the tools needed for handling the simpler problems of 
group theory, such as those arising in quantum mechanics and in the field of 
molecular structure. A certain amount of material, easily obtained by the 
methods discussed in this chapter, but of somewhat lengthy derivation, has 
been collected at the end in Table 7. 

15.1. Definitions. A group¹ is a set of abstract elements A, B, C, ...,
finite or infinite in number, with a law of combination for any two elements
A and B to form a product² AB such that:

a. Every product of two elements and the square of every element
is a member of the set.

b. The set contains a unit element E for which EA = AE = A for
every member of the set.

c. The associative law holds: A(BC) = (AB)C.

d. Every element has an inverse, $X = A^{-1}$, so that $AX = AA^{-1} = A^{-1}A = E$.

The set of all integers, positive, negative and zero, forms a group if the
law of combination is addition. The unit element is zero and the negative
of every element is its inverse. These numbers do not form a group if the
law of combination is multiplication. In this case, E = 1, but the element
zero has no inverse, hence (d) cannot be satisfied. For any law of combination,
we always speak of a product and write the two elements as if they
were multiplied together.

1 For general treatises on group theory, see references at end of this chapter.

2 Following the convention of sec. 10.10, it is to be understood throughout this
chapter that the elements of a product are to be taken in the order from right to left.


A finite group of order g contains a finite number of elements, g. A
simple example of such a group (of order four) is furnished by the numbers
$\pm 1$, $\pm i$. If n is the smallest integer for which $X^n = E$, n is called the
order of the element X. The n elements $X, X^2, X^3, \ldots, X^{n-1}, X^n = E$
form the period of X, indicated by {X}. The period of a single element is
thus a finite group; it is called a cyclic group.

All of the groups so far mentioned have the property that AB = BA
for every element. When this condition is fulfilled, the group is said to be
Abelian. Two or more cyclic groups (they are also Abelian) may be combined
to form a single group which is non-Abelian. Suppose

$$A^3 = E; \qquad C^2 = E; \qquad CA = A^{-1}C \tag{15-1}$$

then the group, which we designate by $D_3$ (for reasons which appear later)
is of order six with elements $E, A, A^2, C, AC, A^2C$. The products of these
elements may be arranged in a multiplication table; CA, for example, is
found at the intersection of row C and column A. If we let $A^2 = B$,
$AC = D$, $A^2C = F$ and use (1) we obtain for the group $D_3$



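$$\begin{array}{c|cccccc}
  & E & A & B & C & D & F \\ \hline
E & E & A & B & C & D & F \\
A & A & B & E & D & F & C \\
B & B & E & A & F & C & D \\
C & C & F & D & E & B & A \\
D & D & C & F & A & E & B \\
F & F & D & C & B & A & E
\end{array} \tag{15-2}$$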


It should be noticed that each element occurs once and only once in each 
row or column. 
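
The construction of table (2) from the relations (1) can be sketched in a few lines of Python. This is only an illustration under our own assumptions: each element is encoded as a pair (i, j) standing for the normal form $A^iC^j$, and the names `NAMES`, `multiply` and `TABLE` are hypothetical helpers that do not appear in the text.

```python
# Sketch: build the D3 multiplication table from the relations (15-1).
# Assumption: every element is written in the normal form A^i C^j,
# with i = 0, 1, 2 and j = 0, 1.
from itertools import product

NAMES = {(0, 0): "E", (1, 0): "A", (2, 0): "B",   # B = A^2
         (0, 1): "C", (1, 1): "D", (2, 1): "F"}   # D = AC, F = A^2 C

def multiply(x, y):
    """Return (A^i C^j)(A^k C^l), moving C past A with CA = A^-1 C."""
    (i, j), (k, l) = x, y
    power_of_a = (i - k) % 3 if j == 1 else (i + k) % 3
    return (power_of_a, (j + l) % 2)

ORDER = "EABCDF"
CODE = {name: code for code, name in NAMES.items()}
# TABLE[(row, column)] is the product (row element)(column element).
TABLE = {(r, c): NAMES[multiply(CODE[r], CODE[c])]
         for r, c in product(ORDER, repeat=2)}

for r in ORDER:
    print(r, "|", " ".join(TABLE[(r, c)] for c in ORDER))
```

Each printed row and each column contains every element exactly once, in agreement with the remark above.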

Problem a. Use (15-1) to derive the multiplication table of (15-2). 

Problem b. Show that if any element occurs more than once in a row or column 
of a multiplication table for a group then the group postulates (a)-(d) could not be 
fulfilled. 

15.2. Subgroups. A group whose elements are all contained in another
group is called a subgroup. Thus we may always find subgroups in any
group by forming the period of each of its elements. For example, in $D_3$
a subgroup of order three is obtained from {A} = {B} = E, A, B. Similarly,
three different subgroups, each of order two, may be found: {C} = E, C;
{D} = E, D; {F} = E, F. In addition to these subgroups, the single
element E is a subgroup of order one while the group itself is a subgroup of
order six. In this case, each subgroup, except the group itself, is cyclic. It
does not follow, however, that all subgroups are cyclic.

Suppose a given group is of order g and a subgroup of it is of order h
with elements $A_1, A_2, \ldots, A_h$. Now take B, an element of the group which
is not contained in the subgroup, and form the products $BA_1, BA_2, \ldots,
BA_h$. These must all be in the group but none can be in the subgroup, for
if $BA_i = A_j$ were one of the members of the subgroup then $B = A_jA_i^{-1}$
would also be in the subgroup, which is contrary to our assumption
concerning the selection of B. The h products are also distinct, since
$BA_i = BA_j$ would imply $A_i = A_j$. We have now found 2h members of the
group. If 2h < g, it will be possible to find a new element C contained
neither among the elements $A_1, \ldots, A_h$ nor among the elements
$BA_1, \ldots, BA_h$. Repeating the operations of multiplication and using the
same arguments as before we obtain h new elements $CA_1, \ldots, CA_h$. Since
the group is of finite order, the procedure must end when we have found
kh = g elements (k an integer). It thus follows that the order of the
subgroup must be a divisor of the order of the whole group. In the example
of the preceding paragraph, we see that we have found all possible
subgroups since the only divisors of 6
(the order of the group) are 1, 2, 3, and 6.
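
As an illustration, the periods and subgroups of $D_3$ can be enumerated directly from the table (2). The sketch below is not part of the original derivation; the table is typed in as the strings `ROWS`, and `mul`, `period` and `is_closed` are hypothetical helper names of our own.

```python
# Sketch: periods and subgroups of D3, read off the multiplication table (15-2).
from itertools import combinations

ORDER = "EABCDF"
ROWS = ["EABCDF", "ABEDFC", "BEAFCD", "CFDEBA", "DCFAEB", "FDCBAE"]

def mul(x, y):
    """Product XY: the entry in row X and column Y of the table."""
    return ROWS[ORDER.index(x)][ORDER.index(y)]

def period(x):
    """Return the list X, X^2, ..., X^n = E."""
    powers, p = [x], x
    while p != "E":
        p = mul(p, x)
        powers.append(p)
    return powers

def is_closed(subset):
    """A finite subset containing E is a subgroup if it is closed."""
    return all(mul(x, y) in subset for x in subset for y in subset)

# Try every subset that contains E and keep the closed ones.
subgroups = [frozenset("E" + "".join(rest))
             for k in range(6)
             for rest in combinations("ABCDF", k)
             if is_closed("E" + "".join(rest))]

for x in ORDER:
    print(x, "has period", period(x))
print("subgroup orders:", sorted(len(s) for s in subgroups))
```

Every order that is printed divides 6, as the argument above requires.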

15.3. Classes. Let A, B and X be any three elements of a group;
then if $B = X^{-1}AX$, B is said to be the transform of A by the element X;
A and B are conjugate to each other. The following properties of conjugate
elements may be proved; it is easy to verify them for $D_3$ by the use
of the group table (2).

a. Every element is conjugate with itself.

b. If A is conjugate with B, then B is conjugate with A. 

c. If A is conjugate with both B and C, then B and C are conjugate 
with each other. 

The complete set of elements $\mathcal{C} = A_1, A_2, \ldots, A_r$, which are conjugate
with each other, is called a class of the group. If the group contains the
elements $A_1 (= E), A_2, \ldots, A_g$, the class of A may be found by calculating

$$E^{-1}AE = A,\quad A_2^{-1}AA_2,\quad \ldots,\quad A_g^{-1}AA_g$$

although not all of these elements will be distinct, as may be seen from the
following example. Clearly $\mathcal{C}_1 = E$ always forms a class by itself. In
(2), $\mathcal{C}_2 = A, B$, for

$$E^{-1}AE = A;\quad B^{-1}AB = A;\quad D^{-1}AD = B$$
$$A^{-1}AA = A;\quad C^{-1}AC = B;\quad F^{-1}AF = B$$

Similarly $\mathcal{C}_3 = C, D, F$. By arguments similar to those used in discussing
subgroups it follows that the whole group may be separated into a number
of different classes none of which contain any elements in common.
Moreover, if there are h elements of a group which transform a given element
into one particular element of its class (into itself, for example), then the
number of elements in that class is r = g/h, where g is the order of the group.
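
The classes of $D_3$ and the relation r = g/h can be checked mechanically from the table (2). The following is a minimal sketch; `inverse`, `transform` and `stabilizer_count` are hypothetical helper names introduced only for this example.

```python
# Sketch: conjugate classes of D3 from the multiplication table (15-2).
ORDER = "EABCDF"
ROWS = ["EABCDF", "ABEDFC", "BEAFCD", "CFDEBA", "DCFAEB", "FDCBAE"]

def mul(x, y):
    """Product XY: the entry in row X and column Y of the table."""
    return ROWS[ORDER.index(x)][ORDER.index(y)]

def inverse(x):
    """The element Y for which XY = E."""
    return next(y for y in ORDER if mul(x, y) == "E")

def transform(a, x):
    """Transform of A by X, i.e. X^-1 A X."""
    return mul(mul(inverse(x), a), x)

# Collect the distinct classes in the order in which they are met.
classes = []
for a in ORDER:
    cls = {transform(a, x) for x in ORDER}
    if cls not in classes:
        classes.append(cls)

def stabilizer_count(a):
    """Number h of elements X with X^-1 A X = A."""
    return sum(1 for x in ORDER if transform(a, x) == a)

for cls in classes:
    a = sorted(cls)[0]
    print(sorted(cls), "r =", len(cls), "h =", stabilizer_count(a))
```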





15.4. Complexes. A set of elements from a group, considered as a
whole, is called a complex. If the complex $\mathcal{C}$ contains A, B, C then
$C\mathcal{C}$ contains CA, CB, $C^2$. By the product of two complexes
$\mathcal{A}\mathcal{B}$ we mean the product of every element in $\mathcal{A}$ with every
element in $\mathcal{B}$, but products occurring more than once are only taken
once. By the complex $\mathcal{G}$ we mean the whole group. If $\mathcal{H}$ is a
subgroup, then

$$\mathcal{H}\mathcal{H} = \mathcal{H}^2 = \mathcal{H} \tag{15-3}$$

If X is an element of $\mathcal{G}$ not contained in $\mathcal{H}$ then the complex
$\mathcal{H}X$ is called a right coset (Nebengruppe) and $X\mathcal{H}$ is a left coset.
Cosets are not groups since $\mathcal{H}X$ does not contain E. It is easy to see
that if another element Y is neither in $\mathcal{H}$ nor in $\mathcal{H}X$, then the
coset $\mathcal{H}Y$ will contain no element in common with $\mathcal{H}$ or
$\mathcal{H}X$, so that the whole group may be written as a sum of a finite
number of cosets

$$\mathcal{G} = \mathcal{H} + \mathcal{H}X + \mathcal{H}Y + \mathcal{H}Z + \cdots$$

The group may also be divided in this way by means of left cosets. In
$D_3$, we may write

$$\mathcal{G} = \mathcal{H} + \mathcal{H}C = \mathcal{H} + \mathcal{H}D = \mathcal{H} + \mathcal{H}F
= \mathcal{H} + C\mathcal{H} = \mathcal{H} + D\mathcal{H} = \mathcal{H} + F\mathcal{H}$$

where $\mathcal{H} = E, A, B$. The index of a subgroup equals the order of the
group divided by the order of the subgroup. It also equals the number of
complexes obtained by splitting a group into that particular subgroup and
its cosets; two, in the example just given.
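
The splitting of $D_3$ into cosets can be sketched as follows. The example assumes the table (2) typed in as `ROWS`; `right_coset` and `left_coset` are hypothetical helper names. It also shows that for the subgroup E, C the left and right cosets are different complexes, whereas for E, A, B they agree.

```python
# Sketch: right cosets HX and left cosets XH in D3, using table (15-2).
ORDER = "EABCDF"
ROWS = ["EABCDF", "ABEDFC", "BEAFCD", "CFDEBA", "DCFAEB", "FDCBAE"]

def mul(x, y):
    """Product XY: the entry in row X and column Y of the table."""
    return ROWS[ORDER.index(x)][ORDER.index(y)]

def right_coset(subgroup, x):
    """The complex HX."""
    return frozenset(mul(h, x) for h in subgroup)

def left_coset(subgroup, x):
    """The complex XH."""
    return frozenset(mul(x, h) for h in subgroup)

def split(subgroup, coset):
    """Distinct complexes obtained from all X in the group."""
    return {coset(subgroup, x) for x in ORDER}

for subgroup in ("EAB", "EC"):
    right, left = split(subgroup, right_coset), split(subgroup, left_coset)
    print(subgroup, "index", len(right))
    print("  right:", sorted("".join(sorted(c)) for c in right))
    print("  left: ", sorted("".join(sorted(c)) for c in left))
```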

15.5. Conjugate Subgroups. If a subgroup $\mathcal{H}$ contains the elements
$H_1 (= E), H_2, \ldots, H_h$ then it also contains $EH_j = H_j, H_2H_j, \ldots,
H_hH_j$ for every $H_j$ in $\mathcal{H}$, and it contains $H_j^{-1}E = H_j^{-1},
H_j^{-1}H_2, \ldots, H_j^{-1}H_h$. In fact these arrangements of the h elements
of $\mathcal{H}$ are identical except for the sequence in which the members are
written. Still another arrangement is $H_j^{-1}EH_j = E, H_j^{-1}H_2H_j, \ldots,
H_j^{-1}H_hH_j$. To see this, sort out the arrangement $EH_j = H_j, \ldots,
H_hH_j$ so that the natural order $H_1, H_2, \ldots, H_h$ is regained and
multiply each element by $H_j^{-1}$. A similar argument will show that for X,
any member of the group (not necessarily contained in $\mathcal{H}$),
$X^{-1}\mathcal{H}X$ is also a subgroup, but $X^{-1}\mathcal{H}X$ and $\mathcal{H}$,
called conjugate subgroups, may be different if X is not in $\mathcal{H}$. When
$\mathcal{H}$ and $X^{-1}\mathcal{H}X$ are identical for every X in the group,
$\mathcal{H}$ is called an invariant subgroup or a normal divisor. To illustrate
these statements choose $\mathcal{H} = E, C$ and $\mathcal{H} = E, A, B$ from $D_3$.
It is easily verified that the only invariant subgroup of $D_3$ (apart from E
alone and the whole group) is $\mathcal{H} = E, A, B$. The invariant subgroup and
its cosets form a group³ called the quotient (or factor) group with the
invariant subgroup as unit element. In $D_3$, if $\mathcal{F} = \mathcal{H}C$, then
the multiplication table of the quotient group $\mathcal{G}/\mathcal{H}$ is

$$\begin{array}{c|cc}
 & \mathcal{H} & \mathcal{F} \\ \hline
\mathcal{H} & \mathcal{H} & \mathcal{F} \\
\mathcal{F} & \mathcal{F} & \mathcal{H}
\end{array} \tag{15-4}$$


15.6. Isomorphism. Two groups $\mathcal{G}$ and $\mathcal{G}'$ are said to be simply
isomorphic if to each element A, B, C, ... of $\mathcal{G}$ there corresponds an
element A', B', C', ... of $\mathcal{G}'$ so that if AB = C, then A'B' = C' for every
product. In the general case, two or more elements of one group may be
isomorphous with a single element of another group. Thus the quotient group
(4) is multiply isomorphous with $D_3$, for $\mathcal{H}$ corresponds to E, A, B and
$\mathcal{F}$ to C, D, F.

In order to find a group which is simply isomorphous with $D_3$, we consider
the n! permutations of n symbols. By (acbed) we shall mean a
replaced by c, c replaced by b, b by e, e by d and d by a. This may also be
written as (bedac) or (dacbe) as long as we do not change the cyclic order of
the symbols. When a single letter occurs in a parenthesis, that letter is
unaffected by the permutation, hence we will write (bce)(a)(d) as (bce).
By the product of two permutations, we mean the permutation indicated
in the right parenthesis followed by the permutation in the left parenthesis.
For example, in the product (acbed)(bce), b is replaced by c and then c by b,
the net result for b being that it returns to its original position. Continuing
in this way, we obtain (acbed)(bce) = (cda). If we use only three letters
and write

$$\begin{aligned}
E &= (a)(b)(c) & B &= (abc) & D &= (ac)(b) \\
A &= (acb) & C &= (a)(bc) & F &= (ab)(c)
\end{aligned} \tag{15-5}$$

the resulting operations form a group which is simply isomorphic with $D_3$,
for

AB = (acb)(abc) = (a)(b)(c) = E
BC = (abc)(bc) = (ab) = F; etc.
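
The rule for multiplying permutations can be tried out with a short sketch. We assume here that a permutation is stored as a dictionary mapping each letter to the letter that replaces it; `cycles` and `compose` are hypothetical helpers of our own, and `ROWS` is the table (2) typed in by hand.

```python
# Sketch: products of permutations, taken from right to left.
ORDER = "EABCDF"
ROWS = ["EABCDF", "ABEDFC", "BEAFCD", "CFDEBA", "DCFAEB", "FDCBAE"]

def mul(x, y):
    """Product XY in D3: row X, column Y of the table (15-2)."""
    return ROWS[ORDER.index(x)][ORDER.index(y)]

def cycles(*cs, letters="abcde"):
    """Build the map letter -> replacement from cycles such as "acbed"."""
    perm = {x: x for x in letters}
    for c in cs:
        for i, x in enumerate(c):
            perm[x] = c[(i + 1) % len(c)]
    return perm

def compose(left, right):
    """Apply `right` first, then `left`."""
    return {x: left[right[x]] for x in right}

# The example of the text: (acbed)(bce) = (cda).
print(compose(cycles("acbed"), cycles("bce")) == cycles("cda"))

# The six permutations on three letters, named as in (15-5).
P = {"E": cycles(letters="abc"),
     "A": cycles("acb", letters="abc"), "B": cycles("abc", letters="abc"),
     "C": cycles("bc", letters="abc"), "D": cycles("ac", letters="abc"),
     "F": cycles("ab", letters="abc")}

# Simple isomorphism: every product agrees with the table of D3.
print(all(compose(P[x], P[y]) == P[mul(x, y)] for x in ORDER for y in ORDER))
```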

Problem. Derive the complete multiplication table for the group of permutations 
on three letters. 

3 Note that the elements of the quotient group are complexes, i.e., collections of the
original elements of $\mathcal{G}$.





15.7. Representation of Groups. If to every member of a group,
$A_1, A_2, A_3, \ldots$, we can associate a square matrix $D(A_1), D(A_2),
D(A_3), \ldots$ in such a way that $D(A_i)D(A_j) = D(A_k)$ whenever
$A_iA_j = A_k$, then the matrices themselves form a group isomorphous with
$\mathcal{G}$. Such matrices are a representation of the group; their order is
the degree or dimension of the representation. One trivial example of a
representation is the unit matrix E associated with every element of the
group. A representation for $D_3$ may be obtained from its quotient group if
we associate $\mathcal{H}$ with the matrix [1] and $\mathcal{F}$ with the matrix [-1].

To find another representation of $D_3$, let us think of the symbols
a, b, c as the components of a vector x and the elements of the group as
operations which change x into a new vector x' with the same components
but in a different order. Hence the required representation D will be a
matrix such that x' = Dx where the rows and columns are labelled with the
components a, b, c. Now E is the operation which replaces each component
by itself so D(E) is the unit matrix. On the other hand, A replaces a by
c, but a itself becomes b, etc., so unity will appear in D(A) at the
intersection of the a-th row and the b-th column, etc. Continuing in this way,
we find



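$$D(E) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix};\quad
D(A) = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix};\quad
D(B) = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

$$D(C) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix};\quad
D(D) = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix};\quad
D(F) = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{15-5a}$$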


By multiplying the matrices together, it will be seen that the
multiplication table (2) is reproduced. For example, D(A)D(B) = D(E) and
D(A)D(C) = D(D). Thus (5a) is a representation of $D_3$.
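
The matrices (5a) can be generated from the permutations and multiplied by machine. The sketch below assumes each permutation is stored as a dictionary from a letter to the letter replacing it; `PERMS`, `matrix` and `matmul` are hypothetical names used only for this illustration.

```python
# Sketch: permutation matrices for D3 and two of the products quoted above.
LETTERS = "abc"
# letter -> letter that replaces it, for the six permutations on a, b, c
PERMS = {
    "E": {"a": "a", "b": "b", "c": "c"},
    "A": {"a": "c", "c": "b", "b": "a"},   # (acb)
    "B": {"a": "b", "b": "c", "c": "a"},   # (abc)
    "C": {"a": "a", "b": "c", "c": "b"},   # (bc)
    "D": {"a": "c", "c": "a", "b": "b"},   # (ac)
    "F": {"a": "b", "b": "a", "c": "c"},   # (ab)
}

def matrix(perm):
    """Unity at (row of the replacing letter, column of the replaced letter)."""
    m = [[0] * 3 for _ in range(3)]
    for old, new in perm.items():
        m[LETTERS.index(new)][LETTERS.index(old)] = 1
    return m

def matmul(p, q):
    """Ordinary 3 x 3 matrix product."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

D = {name: matrix(perm) for name, perm in PERMS.items()}
print(matmul(D["A"], D["B"]) == D["E"])
print(matmul(D["A"], D["C"]) == D["D"])
```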

Suppose a representation of a group has been found, consisting of
matrices $D = D(A_1), D(A_2), \ldots, D(A_g)$, each matrix being of dimension
n. Then it is often possible to find a new coordinate system, i.e., a
transformation of the type $Q^{-1}DQ$, such that every matrix D is changed to
the form

$$\begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix} \tag{15-6}$$

where $D_1$ is of order m, m < n, and $D_2$ is of order (n - m). Under these
conditions, the representation D is said to be reducible⁴ into $D_1$ and $D_2$.

4 In the more general case, the matrices are converted to the triangular form of
(10-40). If the form obtained is that of (6), the representation is said to be completely
reducible.





We now examine $D_1$ and $D_2$ to see if they are reducible, continuing until D
is completely reduced. When this has been accomplished, we will have a
relation between the original and final coordinate systems such as z = Qx
and

$$Q^{-1}DQ = \mathrm{diag}[\Gamma^{(1)}, \Gamma^{(2)}, \ldots, \Gamma^{(s)}] = \Gamma \tag{15-7}$$

where the $\Gamma^{(i)}$ are themselves matrices.

It should be understood that if there are g elements in the group, there
will be g equations like (7), one for each element; D means the set of g
matrices in the original coordinate system and $\Gamma$ means the same matrices
in the new coordinate system. Suppose there are s irreducible
representations in (7), $\Gamma^{(1)}, \Gamma^{(2)}, \ldots, \Gamma^{(s)}$; each one of these is
a set of g matrices, one for each element of the group, $\Gamma^{(j)} =
\Gamma^{(j)}(A_1), \Gamma^{(j)}(A_2), \ldots, \Gamma^{(j)}(A_g)$.
Each $\Gamma^{(j)}$ is isomorphous with the corresponding D in the original
coordinate system since the two sets of matrices are related to each other by a
collineatory transformation (cf. sec. 10.11).

It may happen that some $\Gamma^{(j)}$ may appear more than once or not at all
in the reduction of a given representation. To indicate this, we rewrite (7)
as

$$\Gamma = c_1\Gamma^{(1)} + c_2\Gamma^{(2)} + \cdots + c_s\Gamma^{(s)} \tag{15-8}$$

where the c's are positive integers or zero. Such an expression, called the
direct sum, is not meant to imply that the $\Gamma^{(j)}$ are to be added. It is
simply a shorthand method of showing that the matrices D have been
reduced to the form (7).

It is of considerable advantage to choose unitary or orthogonal matrices
as the representations of groups and we shall suppose that this is always
done. Under these conditions the following statements may be proved.⁵
Two irreducible representations will be orthogonal, and if $d_j$ is the
dimension of $\Gamma^{(j)}$, then

$$\sum_{A} \Gamma^{(i)}(A)_{\mu\nu}\,\Gamma^{(j)}(A)^{*}_{\mu'\nu'}
= \frac{g}{\sqrt{d_i d_j}}\,\delta_{ij}\,\delta_{\mu\mu'}\,\delta_{\nu\nu'} \tag{15-9}$$

the summation to be made over the g elements of the group, $A_1, A_2, \ldots,
A_g$. Moreover if there are s classes of elements in a group, there will be
exactly s different irreducible representations and

$$d_1^2 + d_2^2 + \cdots + d_s^2 = g \tag{15-10}$$

It is not always possible to obtain all s of the irreducible representations
from a single set of reducible matrices D since some of the $c_j$ in (8) may be
zero. If this is the case, another set of matrices D' must be found and
these must be reduced in the same way until the complete set is obtained.

5 See texts on group theory cited at end of this chapter. 





15.8. Reduction of a Representation. We now wish to show how it is
possible to find all of the irreducible representations for $D_3$. Since g = 6
and s = 3, it follows from (10) that they are of dimension 1, 1 and 2. To
find the two representations⁶ of degree one, we consider the quotient group
(4) with two classes, $\mathcal{C}^+$ containing $\mathcal{H}$ and $\mathcal{C}^-$ containing
$\mathcal{F}$. Its two representations are $\Gamma^{(1)}(\mathcal{C}^+) =
\Gamma^{(1)}(\mathcal{C}^-) = 1$; $\Gamma^{(2)}(\mathcal{C}^+) = 1$;
$\Gamma^{(2)}(\mathcal{C}^-) = -1$. While these are almost trivial, it is seen that
they satisfy all of the requirements for a representation of $D_3$. They are
therefore taken as its two representations of degree one.

In order to obtain the representation of dimension two, we attempt to
reduce the matrices of (5a). We expect to get, as a result, matrices of
the form of (6) where $D_1$ is either $\Gamma^{(1)}$ or $\Gamma^{(2)}$, and $D_2$ is a
set of two-dimensional matrices. We note that each of the matrices of (5a) is
orthogonal and from the discussion of sec. 10.17 we see that another real
orthogonal matrix will reduce any of them to the desired form. The
columns of the reducing matrix will be composed of the eigenvectors of one
of the matrices to be reduced. If we choose D(A) we find that its
eigenvalues are $1, e^{i\phi}, e^{-i\phi}$, where $\phi = 2\pi/3$. Taking linear
combinations of the complex eigenvectors and normalizing them the result is
$3^{-1/2}[1, 1, 1]$; $6^{-1/2}[1, -2, 1]$; $2^{-1/2}[-1, 0, 1]$. They form the
columns of a matrix Q which will reduce each matrix of (5a) by the
transformation indicated in (7). The diagonal elements will be $\Gamma^{(1)}$ and
the two-dimensional matrices which were sought. A typical result is

$$Q^{-1}D(A)Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1/2 & -\sqrt{3}/2 \\ 0 & \sqrt{3}/2 & -1/2 \end{bmatrix} \tag{15-11}$$
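
The reduction of D(A) can be checked numerically. The sketch below assumes the three normalized vectors quoted above as the columns of Q and uses the fact that Q is orthogonal, so that its inverse is its transpose; `BASIS`, `transpose` and `matmul` are hypothetical names chosen for this example.

```python
# Sketch: numerical check of the reduction of D(A) by the orthogonal matrix Q.
from math import sqrt, isclose

D_A = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

# The three normalized vectors quoted in the text, one per row here.
BASIS = [
    [1 / sqrt(3), 1 / sqrt(3), 1 / sqrt(3)],
    [1 / sqrt(6), -2 / sqrt(6), 1 / sqrt(6)],
    [-1 / sqrt(2), 0.0, 1 / sqrt(2)],
]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

Q = transpose(BASIS)                      # the vectors become the columns of Q
REDUCED = matmul(transpose(Q), matmul(D_A, Q))   # Q^-1 = Q^T for orthogonal Q

for row in REDUCED:
    print([round(v, 4) for v in row])
```

The first row and column contain only the entry 1, which is $\Gamma^{(1)}$; the remaining 2 × 2 block is a rotation through $2\pi/3$.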


Other methods⁷ of reducing a given representation may be found. Consider,
for example, the effect of the matrices (5a) in changing a vector
x into another vector x' by the relation x' = Dx. In such an operation
two components of the vector are simply interchanged in their original
plane or else both of them are transferred to a plane perpendicular to
the one in which they originally lay. These relations could be examined
in a new coordinate system in which $x_1$ is along the normal to the plane
determined by $x_2$ and $x_3$. Calling the new system y, not necessarily a
rectangular Cartesian one, $x_2$ could be taken in the plane of $y_1, y_2$ and
$x_3$ in the plane of $y_2, y_3$. The relation between the new and old
coordinates is then x = Sy, where $x_1 = y_1 + y_2 + y_3$; $x_2 = y_1 - y_2$;
$x_3 = y_2 - y_3$. In this new system, x' = Sy' and, since S is non-singular,
$y' = S^{-1}DSy$, according to sec. 10.13. When the reciprocal matrix is found
(note that S is not orthogonal)

$$S^{-1}D(A)S = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & -1 & 0 \end{bmatrix}$$

with similar results for the remaining matrices of (5a). We note again
that we have found $\Gamma^{(1)}$ and two-dimensional representations.

6 The reader will recall that the complex $\mathcal{H}$ must be regarded as a single element of
the quotient group $\mathcal{G}/\mathcal{H}$. It is true that $\mathcal{H}$ is made up of the elements E, A, and B of
the original group $D_3$, but it acts as the unit element of $\mathcal{G}/\mathcal{H}$. The other element
of the quotient group, $\mathcal{F}$, contains the elements C, D, F of $D_3$.

7 A formal method, based on hypercomplex numbers in a general type of algebra,
called Frobenius algebra, is described by Speiser, Littlewood, and other references
cited at the end of this chapter.

Although these two reductions do not give identical results, their
matrices are related by a similarity transformation and their traces are
identical as can be seen by comparing $S^{-1}D(A)S$ with eq. (11). The
importance of this property is explained in the next section.

In the usual case, it is easier to find another representation of the
required dimension than to reduce one already known. Consider a plane,
equilateral triangle with apexes labeled a, b, c and located in a Cartesian
coordinate system so that the coordinates of its apexes are $a = (1, 0)$;
$b = \tfrac{1}{2}(-1, \sqrt{3})$; $c = -\tfrac{1}{2}(1, \sqrt{3})$. The elements of the
permutation group (5) will then be seen to correspond with the following
operations on this triangle: (E) identity; (A) rotation of the triangle about
the origin of the coordinate system through the angle $2\pi/3$ in the
counter-clockwise direction; (B) rotation by $4\pi/3$ in the same direction, or
by $2\pi/3$ in the clockwise direction; (C) rotation through the angle $\pi$
about an axis lying in the plane of the triangle and passing through y = 0;
(D) a similar rotation about an axis through $y = -\sqrt{3}x$, which passes
through the apex b of the triangle; (F) rotation about an axis passing
through the apex c, or $y = \sqrt{3}x$.

Since we are considering a space-fixed coordinate system and we are
moving the triangle rather than the coordinate system, the appropriate
two-dimensional matrices for A and B are the transforms of eqs. (52) given
later in this chapter, with $\phi(A) = 2\pi/3$ and $\phi(B) = 4\pi/3$. Operation C
merely changes the sign of the y-coordinates and its matrix is that of
eq. (61). The two remaining matrices for D and F are easily obtained
from the multiplication table for the group, for example, D = CB; F = CA.
They could also be found from geometric considerations.

We now have the three irreducible representations for $D_3$. The
one-dimensional representations have been given in the first paragraph of
this section. Using the matrices obtained from operations with the
triangle, the abbreviated notations E, A, B, ... instead of the more
explicit forms $\Gamma^{(3)}(E)$, etc., and writing $c = \cos\phi = -1/2$;
$s = \sin\phi = \sqrt{3}/2$, $\phi = 2\pi/3$, the matrices of the two-dimensional
representation are given in Table 1.

TABLE 1

$$E = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix};\quad
A = \begin{bmatrix} c & -s \\ s & c \end{bmatrix};\quad
B = \begin{bmatrix} c & s \\ -s & c \end{bmatrix}$$

$$C = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix};\quad
D = \begin{bmatrix} c & s \\ s & -c \end{bmatrix};\quad
F = \begin{bmatrix} c & -s \\ -s & -c \end{bmatrix}$$


15.9. The Character. The task of finding all the irreducible
representations of a given group is usually very laborious. However, for most
physical applications, it is sufficient to know only their trace, a quantity
called the character⁸ in group theory. We shall indicate the trace of
$\Gamma^{(i)}$ by $\chi^{(i)} = \chi^{(i)}(A_1), \chi^{(i)}(A_2)$, etc. A further
simplification is afforded by the fact that elements in the same class are
obtained from each other by a similarity transformation and, as we have
shown in sec. 10.11, the traces of two matrices so related are identical;
hence the character of every element in a single class is the same.
Therefore, if we know the characters of one element from every class
of the group, we have all of the information concerning the group which is
usually needed. We shall indicate the particular class to which we refer
by a subscript, so that the s characters $\chi_1^{(i)}, \chi_2^{(i)}, \ldots,
\chi_s^{(i)}$ refer to the i-th irreducible representation.

The following properties of the characters may be derived⁹ or verified
using tables of characters given in later sections.

a. The class $\mathcal{C}_1 = E$ is always represented by the unit matrix, thus
$\chi_1^{(i)}$ equals the dimension of the representation, which must be a
divisor of the order of the group. We also see from (10) that

$$\sum_{i=1}^{s} [\chi_1^{(i)}]^2 = g \tag{15-12}$$

8 The character (especially of permutation groups) is treated in detail by
Littlewood, D. E., "The Theory of Group Characters," Oxford University Press, 1940.

9 Cf. Speiser, loc. cit., Chapter 12.




If g and s are known, it will usually be found that there is but a single way 
in which this equation can be satisfied. 

b. From (9) it follows that the s characters also form an orthogonal
system. Summing over the classes we obtain

$$\sum_{q=1}^{s} r_q\,\chi_q^{(i)}\chi_q^{(j)*} = g\,\delta_{ij} \tag{15-13}$$

where $r_q$ is the number of elements in the q-th class.

c. If S is the character of a reducible representation, then from (8),
we have

$$S = c_1\chi^{(1)} + c_2\chi^{(2)} + \cdots + c_s\chi^{(s)} \tag{15-14}$$

On multiplying this by $r_q\chi_q^{(j)*}$ and summing over q, we obtain, using (13),

$$c_j = \frac{1}{g}\sum_{q=1}^{s} r_q S_q\,\chi_q^{(j)*} \tag{15-15}$$

When the complete multiplication table for a group is known, the following
procedure¹⁰ may be used to obtain the characters. First calculate the
product of all elements in the class $\mathcal{C}_i$ by all elements in
$\mathcal{C}_k$. It will be found that the resulting set of elements may be
uniquely arranged in classes and that the same results are obtained
irrespective of whether we multiply $\mathcal{C}_i$ by $\mathcal{C}_k$ or the reverse.
Now a given class may occur in the products several times or not at all. Let
us use $h_{ikj}$ to indicate the number of times the j-th class appears. Then
if we abandon our earlier rule for the multiplication of complexes (cf. sec.
15.4) and take each element of the product as many times as it occurs, we may
write

$$\mathcal{C}_i\mathcal{C}_k = \sum_{j=1}^{s} h_{ikj}\,\mathcal{C}_j$$

where we sum over the total number of classes, s. Having found the numbers
$h_{ikj}$ it is then possible to find the characters from the relations

$$r_i r_k \chi_i \chi_k = \chi_1 \sum_{j=1}^{s} h_{ikj}\, r_j \chi_j \tag{15-16}$$

where $r_i$ is the number of elements in $\mathcal{C}_i$.

As an example of the use of this equation, we find for $D_3$

$$\mathcal{C}_2^2 = A^2, B^2, AB, BA = 2\mathcal{C}_1 + \mathcal{C}_2$$

$$\mathcal{C}_3^2 = 3\mathcal{C}_1 + 3\mathcal{C}_2;\quad \mathcal{C}_2\mathcal{C}_3 = 2\mathcal{C}_3$$

10 Proof of the statements in this paragraph may be found in Murnaghan, p. 83;
Speiser, p. 170, loc. cit. They may be verified by using the multiplication table for





(The other products are not needed.) Since $r_1 = 1$; $r_2 = 2$; $r_3 = 3$, we
have

$$4\chi_2^2 = \chi_1(2\chi_1 + 2\chi_2)$$
$$9\chi_3^2 = \chi_1(3\chi_1 + 6\chi_2) \tag{15-17}$$
$$6\chi_2\chi_3 = 6\chi_1\chi_3$$

From (12), we know that $\chi_1$ has the values 1, 1 and 2. Solving (17) with
each of these quantities in turn we obtain the entries in Table 2. They are
identical with the traces of the matrices of the last section.


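TABLE 2

$$\begin{array}{c|ccc}
 & \mathcal{C}_1 & \mathcal{C}_2 & \mathcal{C}_3 \\ \hline
\Gamma^{(1)} & 1 & 1 & 1 \\
\Gamma^{(2)} & 1 & 1 & -1 \\
\Gamma^{(3)} & 2 & -1 & 0
\end{array}$$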
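
The class products for $D_3$ and the relation (16) can be verified from the table (2). The following sketch is an illustration only: `class_constants` and `satisfies_15_16` are hypothetical helper names, and the characters are typed in as the rows $\Gamma^{(1)}$ = 1, 1, 1; $\Gamma^{(2)}$ = 1, 1, -1; $\Gamma^{(3)}$ = 2, -1, 0.

```python
# Sketch: class multiplication constants h_ikj of D3 and a check of (15-16).
from collections import Counter

ORDER = "EABCDF"
ROWS = ["EABCDF", "ABEDFC", "BEAFCD", "CFDEBA", "DCFAEB", "FDCBAE"]
CLASSES = ["E", "AB", "CDF"]                      # C1, C2, C3
CHARACTERS = [[1, 1, 1], [1, 1, -1], [2, -1, 0]]  # one row per representation

def mul(x, y):
    """Product XY: the entry in row X and column Y of the table."""
    return ROWS[ORDER.index(x)][ORDER.index(y)]

def class_constants(i, k):
    """Return [h_ik1, h_ik2, h_ik3], counting every product as often as it occurs."""
    counts = Counter(mul(x, y) for x in CLASSES[i] for y in CLASSES[k])
    # Every member of a class occurs equally often, so one member suffices.
    return [counts[cls[0]] for cls in CLASSES]

def satisfies_15_16(chi):
    """Check r_i r_k chi_i chi_k = chi_1 * sum_j h_ikj r_j chi_j for all i, k."""
    r = [len(cls) for cls in CLASSES]
    return all(
        r[i] * r[k] * chi[i] * chi[k]
        == chi[0] * sum(h * r[j] * chi[j]
                        for j, h in enumerate(class_constants(i, k)))
        for i in range(3) for k in range(3))

print(class_constants(1, 1), class_constants(2, 2), class_constants(1, 2))
print([satisfies_15_16(chi) for chi in CHARACTERS])
```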


Let us apply eq. (15) to the matrices (5a) and confirm a fact that we
already know, namely, that this reducible representation contains
$\Gamma^{(1)}$ and $\Gamma^{(3)}$ once each but not $\Gamma^{(2)}$. From (5a), we see
that $S_1 = 3$; $S_2 = 0$; $S_3 = 1$, hence, using eq. (15),

$$c_1 = (1\cdot3\cdot1 + 2\cdot0\cdot1 + 3\cdot1\cdot1)/6 = 1$$
$$c_2 = (1\cdot3\cdot1 + 2\cdot0\cdot1 - 3\cdot1\cdot1)/6 = 0$$
$$c_3 = (1\cdot3\cdot2 - 2\cdot0\cdot1 + 3\cdot1\cdot0)/6 = 1$$

We have shown how a reducible representation of the group $D_3$ may be
found (cf. sec. 15.7). Now nonzero elements occur on the diagonal of the
matrices of eq. (5a) only when the symbols are unchanged by the permutations
with which $D_3$ is isomorphous. Since these diagonal elements are all unity,
the reducible character of an element of a permutation group is equal to the
number of symbols unchanged by the permutation. This result is very
useful, for every group is isomorphous with some permutation group;
hence when the latter is known it is a simple matter to find $S_q$.
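
Both steps, counting unchanged symbols to obtain $S_q$ and applying eq. (15), can be sketched together. We assume the permutations on a, b, c stored as dictionaries and the characters of $D_3$ typed in by hand; since these characters are real, the complex conjugate in (15) is omitted. `PERMS`, `S` and `multiplicity` are hypothetical names.

```python
# Sketch: reducible characters from fixed symbols, then the coefficients c_j.
from fractions import Fraction

# letter -> letter that replaces it, one representative per class
PERMS = {
    "E": {"a": "a", "b": "b", "c": "c"},
    "A": {"a": "c", "c": "b", "b": "a"},   # (acb), class C2
    "C": {"a": "a", "b": "c", "c": "b"},   # (bc),  class C3
}
CLASS_SIZES = [1, 2, 3]                    # r_1, r_2, r_3
CHARACTERS = {"Gamma1": [1, 1, 1], "Gamma2": [1, 1, -1], "Gamma3": [2, -1, 0]}

# S_q = number of symbols left unchanged by a permutation of the q-th class.
S = [sum(1 for x in "abc" if PERMS[name][x] == x) for name in ("E", "A", "C")]

def multiplicity(chi):
    """c_j = (1/g) * sum_q r_q S_q chi_q  (real characters assumed)."""
    g = sum(CLASS_SIZES)
    return Fraction(sum(r * s * c for r, s, c in zip(CLASS_SIZES, S, chi)), g)

C_J = {name: multiplicity(chi) for name, chi in CHARACTERS.items()}
print(S)
print(C_J)
```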


Problem. Derive Table 2 by the method described in the text. 

15.10. The Direct Product. Two cyclic groups were combined in (1)
to form a single larger group. We now describe another method of
augmenting the order of a group. Suppose $\mathcal{G}'$ is of order m with elements
$A_1, A_2, \ldots, A_m$ and $\mathcal{G}''$ is of order n with elements $B_1, B_2,
\ldots, B_n$ and that every A commutes with every B. Then the mn elements
$A_iB_j$ form a group of order mn called the direct product of $\mathcal{G}'$ and
$\mathcal{G}''$, $\mathcal{G} = \mathcal{G}' \times \mathcal{G}''$. If the
matrices $\Gamma(A)$ and $\Gamma(B)$ are irreducible representations of
$\mathcal{G}'$ and $\mathcal{G}''$, then their direct product

$$\Gamma(A) \times \Gamma(B) = \Gamma(AB)$$

is a representation of $\mathcal{G}$. Moreover, if $\chi_s^{(i)}$ is a character of
$\mathcal{C}_s$ in $\mathcal{G}'$ and $\chi_t^{(j)}$ belongs to $\mathcal{C}_t$ in
$\mathcal{G}''$, then the characters of $\mathcal{C}_{st}$ in $\mathcal{G}$ are given by

$$\chi_{st}^{(ij)} = \chi_s^{(i)}\chi_t^{(j)}$$

If one or both of the representations $\Gamma(A)$ and $\Gamma(B)$ are of the first
degree, the direct product $\Gamma(AB)$ is irreducible. If both are of degree
higher than one, $\Gamma(AB)$ is reducible. The reduction is very simple
provided the table of characters for both groups is known, for multiplication
of one set of characters by another will give a sum of characters already
contained in the table. This can always be uniquely resolved into its
component parts. An illustration of such reduction will be given in sec. 15.18.


SOME SPECIAL GROUPS 

15.11. The Cyclic Group. If a cyclic group is formed from {A},
$A^n = E$ and X is any element of the group defined by $X = A^m$, m = 1, 2,
..., n, then $X^{-1}AX = A$. It thus follows that every element of a cyclic
group or any other Abelian group is in a class by itself. Moreover, we see
from (10) that the n irreducible representations will each be of degree one
so that each representation is also a character. Now if
$\epsilon = \exp(2\pi i/n)$, then $\epsilon$ will be a representation and a
character for A and $\epsilon^m$ will be a character for $A^m$ (m = 1, 2, ..., n),
since these n numbers will satisfy the multiplication properties of the group
elements. Moreover, $\epsilon^{2m}$ will also serve as a set of characters for the
same reason. In fact the n distinct powers of $\epsilon^m$ (m = 1, 2, ..., n)
will give the n characters for each of the n elements.
They are shown in Table 3. We can simplify such a table by using
de Moivre's theorem: $\epsilon^p = \cos(2\pi p/n) + i\sin(2\pi p/n)$. For example,
if n = 4, the only numbers that will occur are ±1 and ±i.


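TABLE 3

$$\begin{array}{c|ccccc}
 & \mathcal{C}_1 = E & \mathcal{C}_2 = A & \mathcal{C}_3 = A^2 & \cdots & \mathcal{C}_n = A^{n-1} \\ \hline
\Gamma^{(1)} & 1 & 1 & 1 & \cdots & 1 \\
\Gamma^{(2)} & 1 & \epsilon & \epsilon^2 & \cdots & \epsilon^{n-1} \\
\cdots & & & & & \\
\Gamma^{(m)} & 1 & \epsilon^{m-1} & \epsilon^{2(m-1)} & \cdots & \epsilon^{(n-1)(m-1)} \\
\cdots & & & & & \\
\Gamma^{(n)} & 1 & \epsilon^{n-1} & \epsilon^{2(n-1)} & \cdots & \epsilon^{(n-1)^2}
\end{array}$$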







15.12. The Symmetric Group. Consider a particular permutation of
five letters which we write as

$$P_e = \begin{pmatrix} a & b & c & d & e \\ c & e & b & a & d \end{pmatrix}$$

This is to be interpreted as meaning: a is replaced by c, b by e, c by b,
d by a, and e by d. A more convenient and equivalent form for such a
permutation is

$$P_e = (acbed)$$


which we have already used in sec. 15.6. The one-line form is called a
cycle; its degree equals the number of letters in the parenthesis. It will be
found that any permutation may be written as a single cycle with no letter
repeated, or as a product of two or more cycles, none of which has a letter in
common. Provided their proper sequence is retained, the letters in a cycle
may be rearranged, but the number and degree of all cycles corresponding
to a given permutation is unique. For example,

$$P_o = \begin{pmatrix} a & b & c & d & e \\ c & e & a & b & d \end{pmatrix}
= (ac)(bed) = (bed)(ac) = (dbe)(ca), \text{ etc.}$$

A cycle of degree two is called a transposition. A cycle of higher degree
may be rewritten as a product of two or more transpositions in several
different ways, but then the product will contain the same letter or letters
in two or more parentheses. However, if the original cycle contained an
even number of letters, the product of transpositions will be composed of an
odd number of transpositions and if the original cycle contained an odd
number of letters, the product will have an even number of transpositions.
Since any permutation may be decomposed into a product of cycles, and
each of the latter may be written as a product of transpositions, it follows
that any permutation may be factored into a product of transpositions.
Moreover, all the different products corresponding to a given permutation
contain either an even number or all contain an odd number of
transpositions. This property of a permutation is unique, and permits us to
speak of even and odd permutations, $P_e$ and $P_o$. As examples, we see that

$P_e$ = (ac)(ab)(ae)(ad) = (ac)(cb)(be)(ed), etc.

$P_o$ = (ac)(be)(ed) = (ca)(be)(bd), etc.
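
The cycle form and the parity of a permutation can be found mechanically. The sketch below assumes a permutation stored as a dictionary from each letter to the letter replacing it, and uses the fact stated above that a cycle of k letters is a product of k - 1 transpositions; `cycle_decomposition` and `is_even` are hypothetical helper names.

```python
# Sketch: cycle decomposition and parity of the two permutations of the text.
def cycle_decomposition(perm):
    """Return the cycles (including cycles of degree one) as strings."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        cycles.append("".join(cycle))
    return cycles

def is_even(perm):
    """A cycle of k letters contributes k - 1 transpositions."""
    return sum(len(c) - 1 for c in cycle_decomposition(perm)) % 2 == 0

P_E = dict(zip("abcde", "cebad"))   # a->c, b->e, c->b, d->a, e->d
P_O = dict(zip("abcde", "ceabd"))   # a->c, b->e, c->a, d->b, e->d

for name, perm in (("P_e", P_E), ("P_o", P_O)):
    parity = "even" if is_even(perm) else "odd"
    print(name, cycle_decomposition(perm), parity)
```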

The symmetric group of order n! is defined as the group of all
permutations, both even and odd, of n letters. The set of n!/2 even
permutations of n letters forms a subgroup of the symmetric group, of order
n!/2; it is called the alternating group. A simple consideration shows it to
be an invariant subgroup. The odd permutations contained in the symmetric
group do not alone form a group, since the product of two odd permutations
is even. However, the complex of odd permutations is one of the
elements of the quotient group of order two which is isomorphous with
the symmetric group, the other element being the complex of even
permutations.

Problem a. Construct elements and group table for the symmetric group on four 
letters. Decompose all elements into transpositions. 

Suppose a permutation has been factored into $\alpha$ cycles of degree one,
$\beta$ cycles of degree two, etc. We describe this arrangement by the symbol
$(1^\alpha 2^\beta 3^\gamma \cdots)$ which is called a partition. It is easy to see
that any permutation P and its inverse $P^{-1}$ will belong to the same
partition, for $P^{-1}$ is formed from P by reversing the order of the letters in
the cycles of P. Thus

$$P_o = (ac)(bed);\quad P_o^{-1} = (ca)(deb)$$

It is also true that elements in the same class belong to the same partition
and that there are as many classes as partitions (cf. Problem a). Now if
the total number of letters in a permutation is n, we must have

$$\alpha + 2\beta + 3\gamma + \cdots = n$$

hence the number of possible partitions or the number of classes equals the
number of distinct solutions of this equation in positive integers or zero.

In order to find the number of elements in a class we must find the
number of permutations having the same cycle structure. Suppose there
are n letters and that the particular class under consideration belongs to
the partition $(1^\alpha 2^\beta 3^\gamma \cdots)$. There are n! ways of arranging
the n letters but not all of the arrangements will lead to a different
permutation. For instance, we may start a given cycle with any letter in it;
i.e., (abc), (bca) and (cab) are identical. This fact means that
$1^\alpha 2^\beta 3^\gamma \cdots$ arrangements will differ only by cyclic
permutation within the various cycles. There is still another possibility of
duplication. It does not matter whether we write (ab)(cd) or (cd)(ab), hence
there are $\alpha!\,\beta!\,\gamma! \cdots$ interchanges of this kind, each
corresponding to the same permutation. We thus conclude that the
number of different arrangements or the number of elements in a class
symbolized by the partition $(1^\alpha 2^\beta 3^\gamma \cdots)$ equals

$$r = \frac{n!}{1^\alpha\,\alpha!\;2^\beta\,\beta!\;3^\gamma\,\gamma! \cdots} \tag{15-18}$$

Application of the methods just described will show that for $n = 4$,
there are 5 classes corresponding to the partitions $(1^4)$, $(1^2,2)$, $(1,3)$,
$(2^2)$, $(4)$. Typical elements of each class are $E = (a)(b)(c)(d)$; $(ab)$;
$(abc)$; $(ab)(cd)$; $(abcd)$. The number of elements in each class is 1, 6, 8,
3 and 6, respectively. The complete class of $(2^2)$ is
$\mathcal{C}_4 = (ab)(cd)$; $(ac)(bd)$; $(ad)(bc)$.
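The class sizes quoted for $n = 4$ can be checked by evaluating eq. (15-18) directly. The following is a minimal sketch; the function name `class_size` and the tuple encoding of a partition by its cycle lengths are our own illustrative choices, not notation used in the text.

```python
from collections import Counter
from math import factorial

def class_size(n, cycle_lengths):
    """Number of permutations of n letters with the given cycle lengths.

    Evaluates n! / (1^a a! 2^b b! 3^c c! ...), where a, b, c, ... count the
    cycles of degree 1, 2, 3, ... as in eq. (15-18).
    """
    assert sum(cycle_lengths) == n, "cycle lengths must add up to n"
    denominator = 1
    for length, multiplicity in Counter(cycle_lengths).items():
        denominator *= length ** multiplicity * factorial(multiplicity)
    return factorial(n) // denominator

# The five partitions of 4, each written as a tuple of cycle lengths.
partitions_of_4 = [(1, 1, 1, 1), (1, 1, 2), (1, 3), (2, 2), (4,)]
sizes = [class_size(4, p) for p in partitions_of_4]
print(sizes)       # [1, 6, 8, 3, 6]
print(sum(sizes))  # 24, the order of the symmetric group on four letters
```

The sum of the class sizes reproduces $4! = 24$, which is a convenient check that no partition has been overlooked.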

Problem b. Verify the statements of the preceding paragraph. 

Two irreducible representations of the symmetric group are found
immediately from the quotient group, for if the even and odd classes are
indicated by $\mathcal{C}^{+}$ and $\mathcal{C}^{-}$, we have$^{11}$

$$\Gamma^{(1)}(\mathcal{C}^{+}) = \Gamma^{(1)}(\mathcal{C}^{-}) = 1$$

$$\bar{\Gamma}^{(1)}(\mathcal{C}^{\pm}) = \pm 1 \tag{15-19}$$

All other irreducible representations are of higher degree. From each one
of these (and also from $\Gamma^{(1)}$ as shown in (20)), a new representation
called the associated representation can be obtained by forming the direct
product

$$\Gamma^{(j)} \times \bar{\Gamma}^{(1)} = \bar{\Gamma}^{(j)} \tag{15-20}$$

Both $\Gamma^{(j)}$ and $\bar{\Gamma}^{(j)}$ have the same dimensions and
$\overline{(\bar{\Gamma}^{(j)})} = \Gamma^{(j)}$. If
$\bar{\Gamma}^{(j)} = \Gamma^{(j)}$, the two representations are
self-associated. Since $\bar{\Gamma}^{(1)} = +1$ for even classes and $-1$ for
odd classes, it follows that

$$\bar{\chi}^{(j)}(\mathcal{C}^{+}) = \chi^{(j)}(\mathcal{C}^{+})$$
$$\bar{\chi}^{(j)}(\mathcal{C}^{-}) = -\chi^{(j)}(\mathcal{C}^{-}) \tag{15-21}$$

In order to satisfy (21), the character of $\mathcal{C}^{-}$ for a
self-associated representation must be equal to zero.

Provided n < 5, a simple method may be used to obtain the complete 
table of characters for the symmetric group. When n > 5, this procedure 
will not give the characters for all the classes but actually it still gives the 
characters which are of interest for physical problems. 12 The restriction 
on n is not a defect of the theory, since eq. (22) which follows is a simpli- 
fied form of the general polynomial which applies for any value of n. 

Suppose there are in a given class of the group $p$ cycles of degrees
$\lambda_1, \lambda_2, \cdots, \lambda_p$ with
$\lambda_1 + \lambda_2 + \cdots + \lambda_p = n$. Then $\chi^{(k)}$ is the
coefficient of $x^k$, $k \le n/2$, in the polynomial

$$(1 - x)(1 + x^{\lambda_1})(1 + x^{\lambda_2}) \cdots (1 + x^{\lambda_p}) = \sum_k \chi^{(k)} x^k \tag{15-22}$$

The coefficients of the highest power of $x^k$, that is, $k = 1$ for $n = 3$
and $k = 2$ for $n = 4$, are the characters of the self-associated
representation.

11 We originally denoted $\bar{\Gamma}^{(1)}$ by the symbol $\Gamma^{(2)}$. It is
convenient here to use a different notation in order to show the relation
between $\Gamma^{(1)}$ and $\bar{\Gamma}^{(1)}$.

12 For proof of this statement and a derivation of the method with n < 5, see Wigner,
E., "Gruppentheorie und ihre Anwendung auf Quantenmechanik der Atomspektren,"
Braunschweig, 1931, Chapter XIII.






We thus obtain from (22) the characters of $k$ representations of the group
while those of the remaining $(s - k)$ representations are the associated ones
which may be obtained by using (21). We illustrate the procedure for the
symmetric group of order 4!.

For the partition $(1^4)$, $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 1$
and the polynomial is $(1 - x)(1 + x)^4$. Since $k \le n/2$, we take the
coefficients of $x^0$, $x$ and $x^2$ which are 1, 3 and 2. The last value, 2,
is the character of a self-associated representation as previously pointed
out. The class under consideration is even, hence the associated characters
are 3 and 1, completing the first column of the character table. For the next
class $(1^2,2)$, we have $\lambda_1 = \lambda_2 = 1$, $\lambda_3 = 2$; the
polynomial is $(1 - x)(1 + x)^2(1 + x^2)$ and the coefficients of $x^0$, $x$
and $x^2$ are 1, 1, 0. The class is odd and the associated characters are
$-1$, $-1$. The remaining polynomials are $(1 - x)(1 + x)(1 + x^3)$;
$(1 - x)(1 + x^2)^2$ and $(1 - x)(1 + x^4)$. All of the characters are given
in Table 4. We have added the number of elements in each class and indicated
by signs the even and odd classes.
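The polynomial expansions used above are easy to mechanize. The sketch below multiplies out the product of eq. (15-22) for each class of the symmetric group on four letters and keeps the coefficients of $x^0$, $x$ and $x^2$. The helper names `poly_mul` and `leading_characters` are hypothetical, and polynomials are stored as plain coefficient lists (index = power of $x$).

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def leading_characters(cycle_lengths):
    """Coefficients of x^k, k <= n/2, in (1 - x) * prod(1 + x^lambda_i)."""
    n = sum(cycle_lengths)
    poly = [1, -1]                              # the factor (1 - x)
    for lam in cycle_lengths:
        factor = [1] + [0] * (lam - 1) + [1]    # the factor (1 + x^lam)
        poly = poly_mul(poly, factor)
    return poly[: n // 2 + 1]

classes = {
    "(1^4)": (1, 1, 1, 1),
    "(1^2,2)": (1, 1, 2),
    "(1,3)": (1, 3),
    "(2^2)": (2, 2),
    "(4)": (4,),
}
for name, lengths in classes.items():
    print(name, leading_characters(lengths))
# (1^4) [1, 3, 2]
# (1^2,2) [1, 1, 0]
# (1,3) [1, 0, -1]
# (2^2) [1, -1, 2]
# (4) [1, -1, 0]
```

Each printed triple supplies one column of the first three rows of the character table; the two remaining rows follow from (21) by changing the sign of the entries in the odd classes.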


TABLE 4

Class                                   (1^4)+   (1^2,2)-   (1,3)+   (2^2)+   (4)-
No. of Elements                           1         6         8        3       6
$\Gamma^{(1)}$                            1         1         1        1       1
$\Gamma^{(2)}$                            3         1         0       -1      -1
$\Gamma^{(3)}$                            2         0        -1        2       0
$\Gamma^{(4)} = \bar{\Gamma}^{(2)}$       3        -1         0       -1       1
$\Gamma^{(5)} = \bar{\Gamma}^{(1)}$       1        -1         1        1      -1


15.13. The Alternating Group. If two elements $A_i$ and $A_j$ of the
symmetric group are in the same class, it does not follow that they will
belong to the same class of the alternating group. Any even class of the
symmetric group which contains none or one cycle of odd order or no cycles
of even order will split into two classes in the alternating group, each of the
new classes containing half as many elements as it contained in the
symmetric group. For example $A$ and $B$ of (5) belong to the same class of the
symmetric group with $n = 3$, but to different classes of the alternating
group, as may be verified from (2).

The characters of the symmetric group which are not self-associated are
also characters of the alternating group. Every character of a
self-associated representation is the sum of two equal characters for the
alternating group except for the two classes which have been obtained by
splitting a class of the symmetric group. Thus if $n = 3$ or 4 and the
character table is known for the symmetric group, we can fill the character
table for the alternating group except for four blank spaces. Suppose the two
classes whose entry in the table is blank are obtained from the partition
$(\lambda_1, \lambda_2, \lambda_3, \cdots)$. Then if
$p = \lambda_1 \lambda_2 \lambda_3 \cdots$, the character $(-1)^{(p-1)/2}$ will
occur in the symmetric group at the intersection of the row corresponding
to the self-associated representation and the column of the class in question
while in the alternating group we will have

$$\tfrac{1}{2}\left[(-1)^{(p-1)/2} \pm \sqrt{(-1)^{(p-1)/2}\, p}\,\right] \tag{15-23}$$

The two remaining vacant places in the table are filled by interchanging
the two characters given by (23).

For $n = 4$, there are 4 classes since $(1,3)^{+}$ splits into $(1,3)'$ and
$(1,3)''$. The self-associated representation is $\Gamma^{(3)}$. Its characters
become $(1, x, y, 1)$ and $(1, y, x, 1)$ where $x$ and $y$ obtained from (23)
are $(-1 \pm i\sqrt{3})/2$ since $\lambda_1 = 1$, $\lambda_2 = 3$, $p = 3$.
Writing $\epsilon = \exp(2\pi i/3)$, we thus have $x = \epsilon$,
$y = \epsilon^2$. This completes the calculation as shown in Table 5.


TABLE 5

Class                (1^4)   (1,3)'         (1,3)''        (2^2)
No. of Elements        1       4              4              3
$\Gamma^{(1)}$         1       1              1              1
$\Gamma^{(2)}$         3       0              0             -1
$\Gamma^{(3)}$         1       $\epsilon$     $\epsilon^2$   1
$\Gamma^{(4)}$         1       $\epsilon^2$   $\epsilon$     1
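As a numerical check on the table for the alternating group with $n = 4$, the sketch below evaluates eq. (15-23) for $p = 3$ and then tests the orthogonality of the four rows, weighting each class by its number of elements. The variable names and the list layout are illustrative assumptions; only the numbers come from the text.

```python
import cmath

# Eq. (15-23) for p = 3: (1/2) * [(-1)^((p-1)/2) +/- sqrt((-1)^((p-1)/2) * p)]
p = 3
sign = (-1) ** ((p - 1) // 2)
x = (sign + cmath.sqrt(sign * p)) / 2
y = (sign - cmath.sqrt(sign * p)) / 2

eps = cmath.exp(2j * cmath.pi / 3)   # epsilon = exp(2*pi*i/3)

# Columns: (1^4), (1,3)', (1,3)'', (2^2)
class_sizes = [1, 4, 4, 3]
characters = [
    [1, 1, 1, 1],
    [3, 0, 0, -1],
    [1, eps, eps ** 2, 1],
    [1, eps ** 2, eps, 1],
]
order = sum(class_sizes)             # 12 elements in all

def inner(row_a, row_b):
    """Class-weighted inner product of two character rows."""
    return sum(
        size * complex(a) * complex(b).conjugate()
        for size, a, b in zip(class_sizes, row_a, row_b)
    )

for i, row_i in enumerate(characters):
    for j, row_j in enumerate(characters):
        expected = order if i == j else 0
        assert abs(inner(row_i, row_j) - expected) < 1e-9

print(abs(x - eps) < 1e-12, abs(y - eps ** 2) < 1e-12)
```

The inner product of a row with itself equals the order of the group, 12, and distinct rows are orthogonal, as required of irreducible characters.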


15.14. The Unitary Group. The collection of all non-singular matrices
of order $n$, with matrix multiplication as the law of combination, is the
representation of a group called the full linear group (FLG). The order
of the group is infinite, for its elements are the infinite number of linear
transformations that change a vector x into a new vector. This group has
many subgroups obtained by imposing certain restrictions on the matrices
of its transformations. Thus, we might exclude all matrices except those
with determinant equal to $\pm 1$ or we might require that the matrices be
orthogonal. Such groups are discrete, if the elements are infinitely
denumerable (an example of a discrete group of this type is given in sec.
15.1); continuous, if the elements are non-denumerable. An example is the
group of rotations about an axis. One may also have mixed-continuous
groups such as $R^{\pm}(2)$ discussed in sec. 15.16. Infinite groups have many
of the properties of finite groups, although naturally some modifications$^{13}$
in their treatment are necessary.

13 See, for example, Wigner, loc. cit., Chapter X.





We first consider a subgroup of FLG, which is called the two-dimensional
unimodular unitary group (SUG, special unitary group). Its elements are
square unitary matrices of order two with determinant of $+1$. Let us
take a matrix

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

and modify it so that these conditions are met. Referring to eq. (10-50),
we see that we must have $c = -b^*$ and $d = a^*$. Thus a typical element of
SUG is

$$U = \begin{bmatrix} a & b \\ -b^* & a^* \end{bmatrix}; \qquad |U| = aa^* + bb^* = 1 \tag{15-24}$$

When this matrix is applied to a column vector $\mathbf{x} = \{x_1, x_2\}$ so
that $U\mathbf{x} = \mathbf{x}'$, we have

$$x_1' = a x_1 + b x_2$$
$$x_2' = -b^* x_1 + a^* x_2 \tag{15-25}$$

It will also transform any function of $\mathbf{x}$ into a function of linear
combinations of $x_1$, $x_2$; for example,

$$U f(\mathbf{x}) = f(\mathbf{x}') = f(a x_1 + b x_2,\; -b^* x_1 + a^* x_2) \tag{15-26}$$

Thus if $U$ operates on a set of $(n + 1)$ homogeneous products

$$f_p^{(n)} = x_1^{p} x_2^{n-p}, \qquad (p = 0, 1, 2, \cdots, n) \tag{15-27}$$

the result is a homogeneous polynomial of the same degree

$$U f_p^{(n)} = (a x_1 + b x_2)^{p} (-b^* x_1 + a^* x_2)^{n-p} = \sum_{k=0}^{n} U_{pk}^{(n)} x_1^{k} x_2^{n-k} \tag{15-28}$$

Clearly, the two-dimensional matrices $U$ are themselves representations
of SUG. But the matrices with elements $U_{pk}^{(n)}$, being isomorphous with
$U$ because of eq. (28), must also be representations, provided we can show
that they are unitary. As a matter of fact, they are not unitary, but if
each element is multiplied by $[p!\,(n - p)!]^{-1/2}$ they become so.
Multiplication of the elements by this constant factor is, of course,
equivalent to multiplying $f_p^{(n)}$ by the same quantity. When we do this, we
find it convenient to set $n = 2j$; $p = j + m$. The purpose of the latter
substitution is to enable us to prove in sec. 15.15 that SUG is isomorphous
with the three-dimensional rotation group.

When these changes have been made, $f_p^{(n)}$ becomes

$$f_m^{(j)} = \frac{x_1^{\,j+m}\, x_2^{\,j-m}}{\sqrt{(j+m)!\,(j-m)!}} \tag{15-29}$$

where $j = 0, \tfrac12, 1, \tfrac32, \cdots$; $m = -j, -j+1, \cdots, j-1, j$.
At the same time, eq. (28) becomes

$$U f_m^{(j)} = \frac{(a x_1 + b x_2)^{j+m} (-b^* x_1 + a^* x_2)^{j-m}}{\sqrt{(j+m)!\,(j-m)!}} = \sum_q U_{mq}^{(j)} f_q^{(j)} \tag{15-30}$$

The resulting matrices whose elements are $U_{mq}^{(j)}$ will be indicated by
$U^{(j)}$. They are unitary and irreducible; furthermore, there are no other
irreducible representations$^{14}$ of SUG.

In order to obtain the elements of $U^{(j)}$, we develop (30) by the binomial
theorem and pick out the coefficient of $f_q^{(j)}$. It is found to be

$$U_{mq}^{(j)} = \sum_t (-1)^t \frac{\sqrt{(j+m)!\,(j-m)!\,(j+q)!\,(j-q)!}}{(j-m-t)!\,(j+q-t)!\,(t-q+m)!\,t!}\; a^{\,j+q-t}\, (a^*)^{\,j-m-t}\, b^{\,t-q+m}\, (b^*)^{\,t} \tag{15-31}$$

In this expression, $t$ takes the values 0, 1, 2, $\cdots$ and the summation
breaks off automatically when negative powers of the $a$'s and $b$'s appear
because the denominator will then contain the factorial of a negative
integer, which is $\infty$.

Since $m$ and $q$ have $(2j + 1)$ possible values, it follows that the matrices
of the representations have dimensions of $(2j + 1)$. If $j = 0$,
$U^{(0)} = 1$. If $j = \tfrac12$, $m$ and $q$ can take the values $\pm\tfrac12$,
hence if the elements of the matrix are characterized by $+\tfrac12$ and
$-\tfrac12$, in that order, we have $U^{(1/2)}$ identical with $U$ of (24).
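The sum over $t$ in the expression for $U_{mq}^{(j)}$ is tedious by hand but short in code. The sketch below assumes the form $\sum_t (-1)^t \sqrt{(j+m)!(j-m)!(j+q)!(j-q)!}\,/\,[(j-m-t)!(j+q-t)!(t-q+m)!\,t!]\; a^{j+q-t}(a^*)^{j-m-t}b^{t-q+m}(b^*)^{t}$, which is what the binomial expansion of (30) gives. To avoid half-integers it takes $2j$, $2m$ and $2q$ as integer arguments; the function names `u_element` and `u_matrix` and this doubled-index convention are our own.

```python
from math import factorial, sqrt

def u_element(two_j, two_m, two_q, a, b):
    """Element U_mq of the (2j+1)-dimensional representation of SUG."""
    jpm = (two_j + two_m) // 2          # j + m
    jmm = (two_j - two_m) // 2          # j - m
    jpq = (two_j + two_q) // 2          # j + q
    jmq = (two_j - two_q) // 2          # j - q
    mmq = (two_m - two_q) // 2          # m - q (may be negative)
    norm = sqrt(factorial(jpm) * factorial(jmm) * factorial(jpq) * factorial(jmq))
    total = 0j
    for t in range(two_j + 1):
        # Terms with a negative factorial argument drop out of the sum.
        if min(jmm - t, jpq - t, t + mmq) < 0:
            continue
        denom = (factorial(jmm - t) * factorial(jpq - t)
                 * factorial(t + mmq) * factorial(t))
        total += ((-1) ** t * norm / denom
                  * a ** (jpq - t) * a.conjugate() ** (jmm - t)
                  * b ** (t + mmq) * b.conjugate() ** t)
    return total

def u_matrix(two_j, a, b):
    """Rows and columns ordered m, q = j, j-1, ..., -j."""
    labels = range(two_j, -two_j - 1, -2)
    return [[u_element(two_j, m, q, a, b) for q in labels] for m in labels]

# Any pair with a a* + b b* = 1 defines an element of SUG.
a, b = 0.48 + 0.64j, 0.36 - 0.48j

U_half = u_matrix(1, a, b)   # j = 1/2: should reproduce the matrix of (24)
U_one = u_matrix(2, a, b)    # j = 1: a 3 x 3 matrix

def is_unitary(M, tol=1e-12):
    n = len(M)
    return all(
        abs(sum(M[i][k] * M[j][k].conjugate() for k in range(n)) - (i == j)) < tol
        for i in range(n) for j in range(n)
    )

print(U_half)
print(is_unitary(U_one))
```

For $j = \tfrac12$ the four elements come out as $a$, $b$, $-b^*$, $a^*$, and for $j = 1$ the rows are orthonormal, in line with the statement that the normalized matrices are unitary.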

In order to determine the characters of SUG let us select a typical
matrix of the group and transform it to diagonal form. A unitary
transformation is required and it is certain that among the infinite number
of unitary matrices in the group, one may be found, say $V$, that will effect
the diagonalization

$$V^{-1} U V = U_1 = \begin{bmatrix} u_{11} & 0 \\ 0 & u_{22} \end{bmatrix} \tag{15-32}$$

Finally, since we require $|U_1| = 1$, the coefficients of $U_1$ may be
determined$^{15}$

$$U_1 = \begin{bmatrix} e^{i\phi/2} & 0 \\ 0 & e^{-i\phi/2} \end{bmatrix} \tag{15-33}$$

All other matrices of the group belong to the same class as $U$ and $U_1$,
for the class is composed of elements which are obtained from each other by
similarity transformations or, in this particular case, by unitary
transformations. Since each matrix is unitary it remains unitary when it
undergoes such a transformation. We also know that it is only necessary to
calculate the character of one element from a class; thus using $U_1$

$$\chi^{(1/2)} = e^{i\phi/2} + e^{-i\phi/2}$$

is the character for the representation of degree $(2j + 1) = 2$.

14 The proof of these facts will be found in Murnaghan, loc. cit., Chapter 3 or Wigner,
loc. cit., Chapter XV.

15 The reason for choosing $e^{i\phi/2}$ instead of $e^{i\phi}$ will become apparent in sec. 15.15.

Now the matrices $U^{(j)}$, which are identical with $U_1$ when
$j = \tfrac12$, must be transformable in such a way that the characters will be
identical for $j = \tfrac12$ and when this is done the characters should apply
to SUG for any value of $j$. If we substitute $a = e^{i\phi/2}$, $b = 0$ in
(31), the result is of diagonal form since all elements disappear unless
$t = 0$ and $m = q$,

$$U_{mq}^{(j)} = e^{im\phi}\, \delta_{mq} \tag{15-34}$$

The required characters for SUG, infinite in number, are thus

$$\chi^{(j)} = \sum_{m=-j}^{j} e^{im\phi} \tag{15-35}$$

A simpler form of the last expression may be obtained as follows. Let
$\rho = e^{i\phi}$ so that

$$\chi^{(j)} = e^{-ij\phi}(1 + \rho + \rho^2 + \cdots + \rho^{2j}) = e^{-ij\phi}\, \frac{(1 - \rho^{2j+1})}{(1 - \rho)}$$

Multiply numerator and denominator by $e^{-i\phi/2}$ and use the relation
$\sin x = i(e^{-ix} - e^{ix})/2$; then

$$\chi^{(j)} = \frac{\sin\,(2j + 1)\phi/2}{\sin \phi/2} \tag{15-36}$$
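The agreement between the sum (15-35) and the closed form (15-36) is easy to confirm numerically. The following sketch again passes $2j$ as an integer so that half-integral $j$ can be handled without fractions; the function names are hypothetical.

```python
import cmath
import math

def chi_sum(two_j, phi):
    """Character as the sum of exp(i m phi) for m = -j, -j+1, ..., j."""
    return sum(
        cmath.exp(1j * (two_m / 2) * phi)
        for two_m in range(-two_j, two_j + 1, 2)
    )

def chi_closed(two_j, phi):
    """Closed form sin((2j+1) phi/2) / sin(phi/2); requires sin(phi/2) != 0."""
    return math.sin((two_j + 1) * phi / 2) / math.sin(phi / 2)

phi = 0.7
for two_j in range(0, 7):            # j = 0, 1/2, 1, ..., 3
    direct = chi_sum(two_j, phi)
    closed = chi_closed(two_j, phi)
    assert abs(direct - closed) < 1e-12
    print(two_j / 2, round(closed, 6))
```

At $\phi \to 0$ the closed form tends to $2j + 1$, the dimension of the representation, which is the character of the unit element.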


The irreducible representations and characters satisfy certain orthog- 
onality and normalization conditions 16 as in the case of finite groups, but 
the summations in (9) and (13) are replaced by integrals. 

15.15. The Three-Dimensional Rotation Groups. Another important
subgroup of FLG (as well as of the $n$ dimensional unitary group) is the
$n$ dimensional full, real orthogonal group which consists of all unitary
matrices with real elements. If we further restrict this subgroup, choosing
all real unitary matrices with determinant equal to $+1$, we have the
$n$-dimensional proper, real orthogonal group or the rotation group. It should
be remembered that an orthogonal matrix need not be unitary but a real
orthogonal matrix and a real unitary matrix are synonymous terms. For
the moment, we consider the three-dimensional rotation group $R^{+}(3)$ whose
elements are real orthogonal matrices of order three.

16 See Wigner, loc. cit., Chapter XV or Eckart, Carl, Rev. Mod. Phys. 2, 344 (1930).




Assume that we have a sphere of unit radius, the center of which
coincides with the origin of a coordinate system OXYZ fixed in space. Now let
the coordinate of some point on the surface of the sphere be $(x, y, z)$ and
rotate the sphere in any manner whatsoever leaving its center fixed. The
new coordinates of the point $(x', y', z')$ will be related to $(x, y, z)$ by
some matrix $R(\alpha, \beta, \gamma)$ which is an element of $R^{+}(3)$. As we
have shown in sec. 9.5, such a rotation may be factored into a product of
three plane rotations described by the Eulerian angles
$(\alpha, \beta, \gamma)$; i.e., we may write

$$R(\alpha, \beta, \gamma) = R_z(\gamma)\, R_x(\beta)\, R_z(\alpha) \tag{15-37}$$

where $R_z$ and $R_x$ are rotations about the Z- and X-axes respectively.

In order to find the representations of $R^{+}(3)$ we could use a method
similar to that of sec. 15.14 and study the effect of transforming a function
of $(x, y, z)$ by the elements of the group. A simpler method$^{17}$ is
available for we will show that $R^{+}(3)$ is isomorphous with SUG. Since we
know the representations of the latter, we may use the same results for
$R^{+}(3)$. We recall, however, that the elements of SUG are two-dimensional
matrices while the elements of $R^{+}(3)$ are three-dimensional, hence the
proof of the isomorphism depends upon finding some relation between these two
kinds of matrices. The problem is an old one which occurred in classical
mechanics; it was solved by Klein and by Cayley, who made use of a special
kind of transformation in the complex plane.$^{18}$ We prefer to proceed in
another way.


We first observe that any two-dimensional matrix may be written as a
linear combination of the four matrices$^{19}$

$$P_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}; \quad
P_2 = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}; \quad
P_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}; \quad
P_4 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{15-38}$$

For example, if

$$H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}$$

we may write

$$H = c_1 P_1 + c_2 P_2 + c_3 P_3 + c_4 P_4$$

where

$$c_1 = (H_{12} + H_{21})/2; \qquad c_2 = i(H_{12} - H_{21})/2$$
$$c_3 = (H_{11} - H_{22})/2; \qquad c_4 = (H_{11} + H_{22})/2$$

17 Both methods are discussed by Wigner, loc. cit., Chapter XV.

18 The details are given by Whittaker, E. T., "Analytical Dynamics," Third
Edition, Cambridge University Press, 1927, p. 12. The quantities $a$ and $b$
which appear in our eq. (24) are identical with the Cayley-Klein parameters.
Eckart, loc. cit., and Bauer, loc. cit., have used a similar method in the
group theory problem.

19 The first three of these are the Pauli spin matrices, discussed in sec. 11.29.



Let us take $c_1 = x$, $c_2 = y$, $c_3 = z$, $c_4 = 0$. Then we have,

$$H(x, y, z) = xP_1 + yP_2 + zP_3 = \begin{bmatrix} z & x - iy \\ x + iy & -z \end{bmatrix} \tag{15-39}$$

Clearly if $x$, $y$, $z$ are real, $H$ is Hermitian. Moreover, its trace is
zero; in fact, any two dimensional matrix with trace of zero may be put into
this form, $P_4$ not being needed. If $H$ is now subjected to a unitary
transformation by the matrix $U$ of (24) its trace is unchanged and we obtain

$$H'(x', y', z') = U^{\dagger} H U = x'P_1 + y'P_2 + z'P_3 \tag{15-40}$$

If we can prove that the relation between $x$, $y$, $z$ and $x'$, $y'$, $z'$ is
a rotation, we may conclude that the matrices $U$ of SUG perform the same
transformations as the matrices of the group $R^{+}(3)$ and that the two
groups are isomorphous. To do this, we note (see the problem in sec. 10.14)
that $|H| = |H'|$, hence

$$x^2 + y^2 + z^2 = x'^2 + y'^2 + z'^2 \tag{15-41}$$

which means that the length of a vector is unchanged by the transformation
of eq. (40) and the latter must be a rotation.
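The invariance of the length under the transformation of eq. (40) can be confirmed for any particular element of SUG. The sketch below assumes the standard Pauli matrices for $P_1$, $P_2$, $P_3$, builds $H$ from a vector, forms $U^{\dagger} H U$, and reads the new components back with the coefficient formulas for $c_1$, $c_2$, $c_3$ given earlier. The helper names and the sample values of $a$, $b$, $x$, $y$, $z$ are arbitrary illustrations.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Hermitian conjugate of a 2 x 2 matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def su2(a, b):
    """Typical element of SUG, eq. (24); requires a a* + b b* = 1."""
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def h_matrix(x, y, z):
    """x P1 + y P2 + z P3, assuming the standard Pauli matrices."""
    return [[complex(z), x - 1j * y], [x + 1j * y, complex(-z)]]

def components(H):
    """Recover (x, y, z) = (c1, c2, c3) from a traceless matrix H."""
    c1 = (H[0][1] + H[1][0]) / 2
    c2 = 1j * (H[0][1] - H[1][0]) / 2
    c3 = (H[0][0] - H[1][1]) / 2
    return c1, c2, c3

a, b = 0.48 + 0.64j, 0.36 - 0.48j       # satisfies a a* + b b* = 1
U = su2(a, b)
x, y, z = 1.0, 2.0, 3.0

H_new = matmul(matmul(dagger(U), h_matrix(x, y, z)), U)
x1, y1, z1 = components(H_new)

length_sq_old = x * x + y * y + z * z
length_sq_new = (x1 * x1 + y1 * y1 + z1 * z1).real
print(length_sq_old, length_sq_new)
```

The transformed components come out real to rounding error, since $U^{\dagger} H U$ is again Hermitian with zero trace, and the two squared lengths agree.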

Let us study some special forms of the matrix $U$ whose general form is
given by eq. (24). We first put $a = e^{i\alpha/2}$, $b = 0$, that is, we use
$U_1$, the diagonal matrix of eq. (33). We easily find

$$U_1^{\dagger} P_1 U_1 = \cos\alpha\, P_1 + \sin\alpha\, P_2$$
$$U_1^{\dagger} P_2 U_1 = -\sin\alpha\, P_1 + \cos\alpha\, P_2 \tag{15-42}$$
$$U_1^{\dagger} P_3 U_1 = P_3$$

With these results, (40) becomes

$$x' = x \cos\alpha + y \sin\alpha$$
$$y' = -x \sin\alpha + y \cos\alpha$$

This clearly represents a rotation through an angle $\alpha$ about Z; it may
be suitably represented by

$$\mathbf{r}' = R_z(\alpha)\, \mathbf{r}$$

where $\mathbf{r}'$ and $\mathbf{r}$ are the vectors having components
$(x', y', z')$ and $(x, y, z)$, respectively and

$$R_z(\alpha) = \begin{bmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{15-43}$$

We have thus identified the element of SUG which corresponds to the last
factor on the right of (37); it is $U_1$. Obviously, $R_z(\gamma)$ corresponds
to a matrix like (43) but with $\alpha$ replaced by $\gamma$.

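In order to find $R_x(\beta)$ we take

$$U_2 = \begin{bmatrix} \cos \beta/2 & i \sin \beta/2 \\ i \sin \beta/2 & \cos \beta/2 \end{bmatrix} \tag{15-44}$$

It is obtained by putting $a = \cos \beta/2$, $b = i \sin \beta/2$ in (24). We
now find

$$U_2^{\dagger} P_1 U_2 = P_1$$
$$U_2^{\dagger} P_2 U_2 = \cos\beta\, P_2 + \sin\beta\, P_3$$
$$U_2^{\dagger} P_3 U_2 = -\sin\beta\, P_2 + \cos\beta\, P_3$$

and (40) may be written $\mathbf{r}' = R_x(\beta)\, \mathbf{r}$, where

$$R_x(\beta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & \sin\beta \\ 0 & -\sin\beta & \cos\beta \end{bmatrix} \tag{15-45}$$

Our notation, $R_x(\beta)$, is meant to exhibit the fact that (45) represents a
rotation through $\beta$ about X. Thus we have shown that by a proper choice
of the elements of $U$, SUG and $R^{+}(3)$ are isomorphous since
$U = U_1(\gamma)\, U_2(\beta)\, U_1(\alpha)$ corresponds to
$R(\alpha, \beta, \gamma)$.

Let us write

$$U(\alpha, \beta, \gamma) = U_1(\gamma)\, U_2(\beta)\, U_1(\alpha)
= \begin{bmatrix} e^{i\gamma/2} & 0 \\ 0 & e^{-i\gamma/2} \end{bmatrix}
\begin{bmatrix} \cos \beta/2 & i \sin \beta/2 \\ i \sin \beta/2 & \cos \beta/2 \end{bmatrix}
\begin{bmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{bmatrix}$$

$$= \begin{bmatrix} e^{i(\alpha+\gamma)/2} \cos \beta/2 & i e^{-i(\alpha-\gamma)/2} \sin \beta/2 \\ i e^{i(\alpha-\gamma)/2} \sin \beta/2 & e^{-i(\alpha+\gamma)/2} \cos \beta/2 \end{bmatrix} \tag{15-46}$$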


On comparing this with (24), we see that we have
$a = e^{i(\alpha+\gamma)/2} \cos \beta/2$ and
$b = i e^{-i(\alpha-\gamma)/2} \sin \beta/2$, so that (31) becomes

$$U_{mq}^{(j)}(\alpha, \beta, \gamma) = \sum_t (-1)^t\, i^{\,m-q}\, \frac{\sqrt{(j+m)!\,(j-m)!\,(j+q)!\,(j-q)!}}{(j-m-t)!\,(j+q-t)!\,(t-q+m)!\,t!}\; e^{iq\alpha} \cos^{2j-m+q-2t} \beta/2 \cdot \sin^{m-q+2t} \beta/2 \cdot e^{im\gamma} \tag{15-47}$$

As before, $j = 0, \tfrac12, 1, \tfrac32, \cdots$. For $j = 0$, we get
$U^{(0)}(\alpha, \beta, \gamma) = 1$; for $j = \tfrac12$ we obtain (46). It may
be shown$^{20}$ that the matrices whose elements are given by (47) are
irreducible representations and that there are no further ones. The
characters of the representations are found from (35). Remembering that
$e^{\pm ix} = \cos x \pm i \sin x$, they may be written as

$$\chi^{(j)}(\alpha) = 1 + 2\cos\alpha + \cdots + 2\cos j\alpha; \qquad \text{if } j = 0, 1, 2, \cdots$$
$$\chi^{(j)}(\alpha) = 2\cos \alpha/2 + 2\cos 3\alpha/2 + \cdots + 2\cos j\alpha; \qquad \text{if } j = \tfrac12, \tfrac32, \cdots \tag{15-48}$$

20 See footnote 17.

Although $R^{+}(3)$ is isomorphous with SUG, the isomorphism is not
simple. If $0 \le \alpha \le 4\pi$, $0 \le \beta \le \pi$,
$0 \le \gamma \le 2\pi$, then as $\alpha$, $\beta$ and $\gamma$ take all values
between these limits, $a$ and $b$ of (24) will take all pairs of values
satisfying the requirement $aa^* + bb^* = 1$ once only. On the other hand,
if $\alpha$, $\beta$ and $\gamma$ are Eulerian angles their limits are
$0 \le \alpha \le 2\pi$, $0 \le \beta \le \pi$, $0 \le \gamma \le 2\pi$. But
the angles occur in (46) divided by 2, hence the trigonometric functions are
undetermined with regard to sign. In other words, every matrix
$R(\alpha, \beta, \gamma)$ is isomorphous with two matrices
$U(\alpha, \beta, \gamma)$. We must thus discard half of the representations
of SUG in order to find the ones appropriate to $R^{+}(3)$. It is easy to see
which ones we want. From (47) it follows that

$$U_{mq}^{(j)}(\alpha + 2\pi, \beta, \gamma) = e^{2\pi i q}\, U_{mq}^{(j)}(\alpha, \beta, \gamma)$$

Now when $j$ is integral, $q$ is also integral, for $-j \le q \le j$ and then
$e^{2\pi i q} = 1$. If $j$ were half integral, the identical rotations
$\alpha$ and $\alpha + 2\pi$ would have representations differing in sign.
However, the set of matrices $U^{(j)}(\alpha, \beta, \gamma)$ for both
integral and half-integral $j$ values is a group which is isomorphous with
$U(\alpha, \beta, \gamma)$ and all matrices $U^{(j)}(\alpha, \beta, \gamma)$ are
representations. This group is of importance in the Pauli spin theory.$^{21}$

If we take as elements of an infinite group, all real unitary matrices of
order three with determinant equal to $+1$ as well as $-1$, we have the
three-dimensional full real orthogonal group $R^{\pm}(3)$. The quotient group
isomorphous with it has two elements. The unit element, which is also an
invariant subgroup of $R^{\pm}(3)$, contains $E$ and all proper rotations $R$
such as (43) or (45) with $|R| = +1$. The other element of the quotient group
is an infinite number of improper rotations $T$ with $|T| = -1$, a typical
one (cf. sec. 10.17) being

$$T_z(\phi) = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & -1 \end{bmatrix} \tag{15-49}$$

The simplest member of the class of $T$ is an improper rotation by the
angle $\pi$, the operation called inversion

$$T(\pi) = I = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \tag{15-50}$$

21 Cf. Chapter 11.




It is always possible to find some improper rotation $T$ which will convert
any other improper rotation $T'$ into an inversion, $T^{-1} T' T = I$, just as
it is always possible to find an inverse to a proper rotation,
$R^{-1} R' R = E$. The group $R^{\pm}(3)$ may thus be considered as the direct
product of $R^{+}(3)$ and the group $I$, the latter having elements $E$ and
$I$. It will have two irreducible representations for every value of $j$,
each being of dimension $(2j + 1)$. The element $R$ has two representations
both equal to $U^{(j)}$ while $T$ has representations $\pm U^{(j)}$.

15.16. The Two-Dimensional Rotation Groups. The two-dimensional
pure rotation group $R^{+}(2)$ is a subgroup of $R^{+}(3)$. Its elements are
the proper rotations in a plane perpendicular to a fixed axis. Let
$R(\phi)$ be one of the elements where $0 \le \phi \le 2\pi$, then if
$\mathbf{x}$ is a vector with components $x_1$ and $x_2$, the element
$R(\phi)$ may be represented by the matrix $C(\phi)$

$$\mathbf{x}' = C(\phi)\, \mathbf{x}; \qquad 0 \le \phi \le 2\pi \tag{15-51}$$

with

$$C(\phi) = \begin{bmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{bmatrix} \tag{15-52}$$

If $R(\phi')$ is another element of the group, which is represented by
$C(\phi')$, then

$$C(\phi)\, C(\phi') = C(\phi + \phi') = C(\phi')\, C(\phi) \tag{15-53}$$

and the group is Abelian. Referring to sec. 15.11, we see that for such
groups, each element is in a class by itself and the irreducible
representations are one-dimensional. Thus (52) is reducible, a unitary matrix
of eigenvectors of $C$ being required for that purpose since $C$ itself is an
orthogonal matrix. The normalized eigenvectors of $C$ are found to be

$$\mathbf{u}_1 = \frac{1}{\sqrt{2}}\{1, i\}; \qquad \mathbf{u}_2 = \frac{1}{\sqrt{2}}\{1, -i\} \tag{15-54}$$

and the eigenvalues are $e^{\pm i\phi}$. These, then, are characters of an
irreducible representation. However, there are an infinite number of classes,
so there must be an infinite number of representations for each class. The
corresponding characters may be taken as

$$\chi^{(m)} = e^{im\phi}; \qquad m = 0, \pm 1, \pm 2, \cdots \tag{15-55}$$

for each will satisfy the multiplication requirement of the group, as
indicated by eq. (53).

The two-dimensional rotary reflection group $R^{\pm}(2)$ is composed of both
proper and improper rotations. A typical element of it is represented by
the matrix

$$A(\phi, d) = \begin{bmatrix} \cos\phi & \sin\phi \\ -d \sin\phi & d \cos\phi \end{bmatrix} \tag{15-56}$$

where $d$ equals the determinant of $A(\phi, d)$ and may be either $+1$ or
$-1$. If $d = +1$, we have a proper rotation with matrix $C(\phi)$; if
$d = -1$, an improper rotation, the matrix of which will be indicated by
$S(\phi)$

$$S(\phi) = \begin{bmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{bmatrix} \tag{15-57}$$

Clearly,

$$S^2(\phi) = S(\phi)\, S(\phi) = E \tag{15-58}$$

but the group is not Abelian for $C(\phi)$ and $S(\phi)$ do not commute. In
fact,

$$A(\phi, d)\, A(\phi', d') = A(d'\phi + \phi',\, dd') \tag{15-59}$$
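The multiplication rule for the matrices of $R^{\pm}(2)$ can be checked numerically for all four sign combinations. The sketch below assumes the rule in the form $A(\phi, d)\,A(\phi', d') = A(d'\phi + \phi', dd')$ and the matrix of eq. (56); the helper names and the sample angles are arbitrary.

```python
import math

def A(phi, d):
    """Matrix of eq. (56): proper rotation for d = +1, improper for d = -1."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-d * s, d * c]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

phi, phi2 = 0.3, 1.1

# Product rule, tested for every combination of determinants.
rule_holds = all(
    close(matmul(A(phi, d), A(phi2, d2)), A(d2 * phi + phi2, d * d2))
    for d in (1, -1) for d2 in (1, -1)
)

C = A(phi, 1)     # proper rotation C(phi)
S = A(phi, -1)    # improper rotation S(phi)
identity = [[1.0, 0.0], [0.0, 1.0]]

s_squared_is_e = close(matmul(S, S), identity)       # eq. (58)
c_and_s_commute = close(matmul(C, S), matmul(S, C))  # expected to fail

print(rule_holds, s_squared_is_e, c_and_s_commute)   # True True False
```

The last flag is `False` because $C(\phi)S(\phi) = A(0, -1)$ while $S(\phi)C(\phi) = A(2\phi, -1)$, which is the non-commutativity noted in the text.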


Let us reduce $S(\phi)$ to diagonal form (cf. sec. 10.17). Its eigenvectors
are found to be

$$\mathbf{v}_1 = \{\cos \phi/2,\; -\sin \phi/2\}; \qquad \mathbf{v}_2 = \{\sin \phi/2,\; \cos \phi/2\} \tag{15-60}$$

and the eigenvalues are $\pm 1$. The resulting diagonal matrix,

$$\sigma = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \tag{15-61}$$

which corresponds to a reflection through the axis of rotation, is that
obtained from $S(\phi)$ when $\phi$ equals 0 or $2\pi$. It is interesting to
observe that the matrix of eigenvectors, eq. (60), is actually a proper
rotation by the angle $\phi/2$. Moreover,

$$S(\phi) = C^{-1}(\phi/2)\, \sigma\, C(\phi/2) \tag{15-62}$$

hence, an improper rotation is equivalent to a proper rotation by the angle
$\phi/2$, followed by a reflection and finally by a proper rotation of
$\phi/2$ in the opposite direction.

It will be remembered that every element of the group $R^{+}(2)$ is in a
class by itself. It does not follow, however, that the proper rotations of
$R^{\pm}(2)$ are each in a separate class. Thus the element represented by
$C(\phi)$ is in the same class with the element represented by $C'(\phi)$,
since

$$S^{-1}(\phi')\, C(\phi)\, S(\phi') = C'(\phi)$$

where

$$C'(\phi) = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix}$$

and

$$C'(\phi) = C^{-1}(\phi) = C(-\phi)$$

There are an infinite number of classes as before but each class contains the
proper rotation by $\phi$ and the proper rotation by $-\phi$.

On the other hand, all improper rotations are in the same class. If
$S(\phi)$ and $S(\phi')$ are representations of two improper rotations, we
find that

$$C^{-1}(\phi'')\, S(\phi)\, C(\phi'') = S(\phi') \tag{15-62a}$$

where $\phi' = \phi + 2\phi''$. This could have been inferred from eq. (62),
for it is a special case of (62a) when we set $\phi'' = \phi/2$, $\phi = 0$ or
$2\pi$ and $\phi' = \phi$.

If the representations of eq. (56) are transformed by the matrix of
eigenvectors (54), the result is

$$\Gamma^{(m)}(\phi, 1) = C^{(m)}(\phi) = \begin{bmatrix} e^{im\phi} & 0 \\ 0 & e^{-im\phi} \end{bmatrix} \tag{15-63}$$

$$\Gamma^{(m)}(\phi, -1) = S^{(m)}(\phi) = \begin{bmatrix} 0 & e^{-im\phi} \\ e^{im\phi} & 0 \end{bmatrix} \tag{15-64}$$

with $m = 1$. However, when $m = 0, 1, 2, \cdots$ the same matrices also
satisfy the multiplication requirements of the group. They are irreducible
except when $m = 0$. There, we obtain (see problem at the end of this
section)

$$C^{(0)}(\phi) = 1; \qquad S^{(0)}(\phi) = 1$$
$$C^{(0')}(\phi) = 1; \qquad S^{(0')}(\phi) = -1 \tag{15-65}$$

A slightly different procedure is sometimes desirable. We see from
eq. (62) that any improper rotation may always be written as a combination
of a proper rotation and a reflection. The elements of the group
could thus be considered as an infinite number of proper rotations and the
single improper rotation which is represented by $\sigma$. When the latter is
transformed by means of (54) we obtain

$$\Gamma(0, -1) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \tag{15-64a}$$

Thus the irreducible representations are those of (63) and the single one of
(64a). When $m = 0$, we again get (65).


Problem. Show that both (63) and (64) may be reduced to diagonal form with the


matrix 



15.17. The Dihedral Groups. An important subgroup of $R^{\pm}(2)$ is
obtained by restricting the values of $\phi$. Consider a regular polygon in
the XY-plane with coordinates of the $n$ corners

$$x_k = r \cos 2\pi k/n; \qquad y_k = r \sin 2\pi k/n; \qquad (k = 0, 1, 2, \cdots, n - 1)$$

where $r$ is the radius vector from the origin to the corner. Now if $\phi$
in (56) takes the value $2\pi/n$, the matrix $A(\phi, d)$ will transform the
polygon into itself by either a proper or an improper rotation. The elements
of the group will be indicated by $C$ in the former case and by $S$ in the
latter. The corresponding matrices are $A(2\pi/n, d)$ with the appropriate
choice of $d$, but we will find it convenient again to use C and S for the
matrices, distinguishing between the abstract element and its representation
by means of different type. The whole group is finite and of order $2n$; it is
called the dihedral group $D_n$. It may be generated by the relations

$$C^n = E; \qquad S^2 = E; \qquad SC = C^{-1}S \tag{15-66}$$

We now see why the group of sec. 15.1 was called $D_3$. If we let $n = 3$ in
eq. (66), we will have the generating relation of eq. (1), provided we
reletter the elements $C$ and $S$ of (66) so that they read $A$ and $C$,
respectively.

Suppose $q$ is an integer; then we may write $n = 2q + 1$ if $n$ is odd or
$n = 2q$ if $n$ is even. There will be $(q + 1)$ classes among the proper
rotations for both $n$ even and $n$ odd. These will correspond to $C$, $C^2$,
$C^3$, $\cdots$, $C^q$, $C^n = E$. For $n$ odd, there will be one additional
class, that of $S$. For $n$ even, there will be two classes involving an
improper rotation. The separation into classes for both cases is illustrated
in the problem in this section.

If $n$ is odd, the classes for proper rotations will be represented by
$C^{(0)}$ and $C^{(0')}$ of eq. (65) and $q$ matrices of (63) with
$m = 1, 2, \cdots, q$. The class of $S$ is represented by the remaining
one-dimensional matrices of (65) and $q$ matrices like (64) or (64a). If $n$
is even, the situation is similar, except for the case $m = n/2$ when (63)
and (64) become

$$C^{(n/2)}(\phi) = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}; \qquad S^{(n/2)}(\phi) = \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix} \tag{15-67}$$

This representation may be reduced to give $C^{(q)} = -1$; $S^{(q)} = 1$;
$C^{(q')} = -1$; $S^{(q')} = -1$. Hence, when $n$ is even there are four
representations of degree one and $(q - 1)$ of degree two. The characters of
dihedral groups are shown in Table 6.


TABLE 6

$n$ odd; $q = (n - 1)/2$; $\phi = 2\pi/n$

                   C(E)   C(C)             ...   C(C^q)             C(S)
$\Gamma^{(0)}$      1      1               ...    1                  1
$\Gamma^{(0')}$     1      1               ...    1                 -1
$\Gamma^{(1)}$      2      $2\cos\phi$     ...    $2\cos q\phi$      0
$\Gamma^{(2)}$      2      $2\cos 2\phi$   ...    $2\cos 2q\phi$     0
...
$\Gamma^{(q)}$      2      $2\cos q\phi$   ...    $2\cos q^2\phi$    0

$n$ even; $q = n/2$

                   C(E)   C(C)                 ...   C(C^{q-1})              C(C^q)                 C(S)   C(CS)
$\Gamma^{(0)}$      1      1                   ...    1                       1                      1      1
$\Gamma^{(0')}$     1      1                   ...    1                       1                     -1     -1
$\Gamma^{(q)}$      1     -1                   ...    $(-1)^{q-1}$            $(-1)^q$               1     -1
$\Gamma^{(q')}$     1     -1                   ...    $(-1)^{q-1}$            $(-1)^q$              -1      1
$\Gamma^{(1)}$      2      $2\cos\phi$         ...    $2\cos (q-1)\phi$       $2\cos q\phi$          0      0
$\Gamma^{(2)}$      2      $2\cos 2\phi$       ...    $2\cos 2(q-1)\phi$      $2\cos 2q\phi$         0      0
...
$\Gamma^{(q-1)}$    2      $2\cos (q-1)\phi$   ...    $2\cos (q-1)^2\phi$     $2\cos q(q-1)\phi$     0      0

Problem. Show that if $n = 6$, the classes of the group are
$\mathcal{C}(E) = E$; $\mathcal{C}(C) = C, C^5$; $\mathcal{C}(C^2) = C^2, C^4$;
$\mathcal{C}(C^3) = C^3$; $\mathcal{C}(S) = S, C^2S, C^4S$;
$\mathcal{C}(S') = CS, C^3S, C^5S$. If $n = 5$, show that the classes are
$\mathcal{C}(E) = E$; $\mathcal{C}(C) = C, C^4$; $\mathcal{C}(C^2) = C^2, C^3$;
$\mathcal{C}(S) = S, CS, C^2S, C^3S, C^4S$.
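The class structure asked for in the problem can be explored by brute force. In the sketch below each element of $D_n$ is stored as a pair `(k, d)` standing for the matrix $A(2\pi k/n, d)$ of eq. (56), and products are formed with the rule $A(\phi, d)A(\phi', d') = A(d'\phi + \phi', dd')$ with the angle index taken modulo $n$. This encoding and the function name `dihedral_classes` are our own assumptions; the code only counts and lists the classes and does not replace the proof requested.

```python
def dihedral_classes(n):
    """Conjugacy classes of D_n, each element written as (k, d)."""
    elements = [(k, d) for d in (1, -1) for k in range(n)]

    def mul(g, h):
        (k, d), (k2, d2) = g, h
        return ((d2 * k + k2) % n, d * d2)

    def inv(g):
        k, d = g
        return ((-d * k) % n, d)

    seen, classes = set(), []
    for h in elements:
        if h in seen:
            continue
        # Collect all conjugates g^-1 h g of the element h.
        cls = {mul(mul(inv(g), h), g) for g in elements}
        seen |= cls
        classes.append(sorted(cls))
    return classes

for n in (6, 5):
    classes = dihedral_classes(n)
    print(n, len(classes), sorted(len(c) for c in classes))
# 6 6 [1, 1, 2, 2, 3, 3]
# 5 4 [1, 2, 2, 5]
```

For $n = 6$ the two classes of three elements are the improper rotations with even and with odd angle index, while for $n = 5$ all five improper rotations fall into a single class, in agreement with the statement that $n$ even gives two classes involving an improper rotation and $n$ odd gives one.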

15.18. The Crystallographic Point Groups.$^{22}$ By considering all
operations which transform certain solid geometric figures into themselves, we
obtain a number of finite subgroups of $R^{\pm}(3)$, called the
crystallographic point groups. They are of considerable importance in the
study of crystal and molecular structure. We assume that one point of the
figure is fixed in space so that if we know the position of two more points
which are not collinear with the fixed point, the position of the figure is
completely determined. Under these conditions, the only possible types of
motion are rotations around an axis passing through the fixed point and
reflections in a plane containing that point. All other motions may be
reduced to a combination of these two, for as we have seen in sec. 15.16 any
improper rotation may always be written as a product of two proper rotations
and a reflection. When the improper rotation is an inversion (i.e., improper
rotation by the angle $\pi$) a point will be collinear with its original
position and some fixed point on the axis of rotation, hence an inversion is
uniquely determined by the position of this fixed point and is independent of
the position of the axis. The fixed point is called a center of inversion.

We thus have four fundamental operations: (a) a proper rotation $C_n$
by an angle $\phi = 2\pi/n$ ($n$ is an integer) about an $n$-fold axis of
rotation; (b) reflection in a plane, indicated by $\sigma_h$, $\sigma_d$,
$\sigma_v$ (subscripts $h$, $d$ and $v$ refer to horizontal, diagonal and
vertical planes); (c) an improper rotation, $S_n$; (d) inversion, indicated by
$I$.

Selected sets of such operations, together with a unit element which
leaves every point of a figure unchanged, are the elements of the
crystallographic groups. The number of these groups which is of interest is
limited by the fact that we need to consider only those types of symmetry
which occur in crystals or molecules. It may be shown, from geometric
arguments, that crystals in nature may have axes of rotation only for
n = 1, 2, 3, 4, 6, and this fact restricts the total number of
crystallographic point groups to 32. Some gaseous molecules may exist with
axes n = 5, 7, 8 and the appropriate character tables are readily found.
For the linear molecule without a center of symmetry, like CO or HCl, the
symmetry group is C_∞v, isomorphous with D_∞ and R^+(2). If a linear
molecule has a center of symmetry, like H_2 or C_2H_2, the group is D_∞h.

22 More details about the crystallographic groups are given by Schoenflies,
A., “Theorie der Kristallstruktur,” Gebrüder Borntraeger, Berlin, 1932, and
in other references cited later in this chapter. The geometric arguments
given here, which we do not prove, are discussed in detail by Schoenflies;
see also Rosenthal, J. and Murphy, G. M., Revs. Mod. Phys. 8, 317 (1936).

The crystallographic groups may be generated in an elegant way from
group theory considerations but we present the results without proof.
Consider first the cyclic groups, designated by C_n (n = 1, 2, 3, 4, 6) and
of order n. A new group of order 2n may be obtained by adding n two-fold
axes of symmetry to C_n, in a plane perpendicular to the principal n-fold
axis of the cyclic group. These are the dihedral groups, D_n (n = 2, 3, 4,
6), only four in number since D_1 duplicates C_2. Two more, containing
proper elements of symmetry only, are the cubic groups, T of order 12 and
O of order 24, having the symmetry of the tetrahedron and the octahedron,
respectively.

Of the required 32 groups, 11 have now been found. They contain
nothing but proper elements of symmetry, so the remaining 21 groups must
contain both proper and improper elements, which could be planes (improper
rotations by the angle zero), the inversion (improper rotation by π), or
rotary reflections (rotation by the angle 2π/n, followed by a reflection in
a plane perpendicular to the axis of rotation). Let us first add horizontal
planes of symmetry, σ_h, to the groups C_n and require that they be
perpendicular to the principal axis of the proper rotation. The results are
C_nh (n = 1, 2, 3, 4, 6) of order 2n, but when n is even a center of
symmetry also exists and the group could also be written as the direct
product, C_nh = C_n × I. When σ_h is added to D_n, we get D_nh (n = 2, 3,
4, 6) and again for n even, D_nh = D_n × I, but n = 1 duplicates C_2v. The
two cubic groups T_h and O_h also have centers of symmetry.

Now add vertical planes of symmetry, σ_v, to C_n, through the n-fold axes
to get C_nv (n = 2, 3, 4, 6) of order 2n. When n = 6, the group is
isomorphous with D_3h. When n = 1, duplication occurs since C_1v and C_1h
are identical in configuration, differing only in orientation. Addition of
vertical planes of symmetry to D_n adds nothing new, for the planes would
coincide with the existing two-fold axes. However, if the added planes are
diagonal, σ_d, and if they bisect the angle of the two-fold axes, we get
D_nd (n = 2, 3) and T_d. When n = 2, the group is isomorphic with C_4v;
when n = 3, D_3d = D_3 × I; n = 4 or 6 would require 8-fold and 12-fold



16.18 


GROUP THEORY 


576 


axes of symmetry, hence they are impossible for crystals; n = 1 duplicates
C_2v.

Three more groups will complete the list of 32. Let us try improper
rotations, S_n, but with n even, for an n-fold improper axis implies a
proper axis, C_p (p = n/2). We are thus limited to n = 2, 4, 6. When n = 2,
S_2 = C_1 × I, usually designated C_i; similarly, S_6 = C_3 × I = C_3i.
There is no center of symmetry, however, for S_4.

For convenient reference, these results are collected in Table 7. The
first column contains the proper groups (G). The cyclic groups are of
order n; the dihedral groups of order 2n; T is of order 12 and O of order
24. Improper groups (G) on the same line in the table with a proper group
(G) have the same order as (G) and the groups are isomorphous with each
other. An improper group (G) × I, which is the direct product of (G) and I,
has an order twice that of (G). Other relations between the various groups
would have been apparent if they had been derived in other possible ways.
These relations may also be found by study of appropriate solid models or
plane diagrams. Stereographic projections of the solid figures are suitable
for such a study. 23
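
The bookkeeping behind the count of 32 can be laid out explicitly. The
sketch below is an illustration in Python rather than anything taken from
the original text: it assumes the line-by-line pairing of groups stated for
Table 7, and the dictionary names (proper, same_line, times_I, orders) are
hypothetical.

```python
# Sketch: tally of the 32 crystallographic point groups and their orders.
# Assumption: the pairing of groups follows the rows of Table 7.

# Proper groups: cyclic C_n (order n), dihedral D_n (order 2n), T and O.
proper = {"C1": 1, "C2": 2, "C3": 3, "C4": 4, "C6": 6,
          "D2": 4, "D3": 6, "D4": 8, "D6": 12, "T": 12, "O": 24}

# Improper groups isomorphous with a proper group (same order).
same_line = {"C2": ["C1h"], "C4": ["S4"], "C6": ["C3h"],
             "D2": ["C2v"], "D3": ["C3v"], "D4": ["C4v", "D2d"],
             "D6": ["C6v", "D3h"], "O": ["Td"]}

# Direct products (G) x I, of twice the order of (G).
times_I = {"C1": "Ci", "C2": "C2h", "C3": "C3i", "C4": "C4h", "C6": "C6h",
           "D2": "D2h", "D3": "D3d", "D4": "D4h", "D6": "D6h",
           "T": "Th", "O": "Oh"}

orders = dict(proper)
for g, names in same_line.items():
    for name in names:
        orders[name] = proper[g]
for g, name in times_I.items():
    orders[name] = 2 * proper[g]

print(len(orders), "groups")
for name, order in sorted(orders.items(), key=lambda item: item[1]):
    print(f"{name:4s} order {order}")
```

The three families contribute 11, 10 and 11 groups respectively.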

The symbols given in Table 7 are those devised by Schoenflies and
generally used in molecular problems. Some alternative, but less used,
symbols are shown in parentheses. The dihedral group D_2, for example, is
often called V for “Vierergruppe.” It is that of the Cartesian coordinate
system with three mutually perpendicular two-fold axes. The meaning of the
other alternative forms will be obvious. Crystallographers, unfortunately,
have used a bewildering variety of systems 24 for designating the groups.

In the tables at the end of this section we present the characters for
all of these groups. It is convenient to indicate a class by means of
symbols like C_n, S_n or σ, a typical element of it. If a number precedes
the symbol it is the number of elements in that class; otherwise the class
in question contains but one element. Representations of degree one are
indicated 25 by A or B; of degree two by E (except for certain cases, where
two one-dimensional representations occur in pairs); of degree three by T.

23 They are given, for example, by Eyring, H., Walter, J., and Kimball,
G. E., “Quantum Chemistry,” John Wiley and Sons, Inc., New York, 1944.
Easily understood perspective drawings of the group symmetries may be found
in Davey, W. P., “A Study of Crystal Structure and Its Applications,”
McGraw-Hill Book Co., Inc., New York, 1934.

24 See Davey, loc. cit. for a discussion of these nomenclatures.

25 Further description of the designation of representations, especially
the usage of molecular spectroscopists, is given by Herzberg, G.,
“Molecular Spectra and Molecular Structure, Vol. II, Infrared and Raman
Spectra of Polyatomic Molecules,” D. Van Nostrand Co., Inc., New York, 1945.





When two one-dimensional representations A and B occur in the same
group, it will be found that the character of A is +1 for the class
representing rotation by 2π/n around the principal n-fold axis and -1 for
B. The principal axis is always taken in the direction of Z.
Representations that differ only in their symmetry with respect to
reflection in a plane perpendicular to the principal axis are distinguished
by ′ and ″, while subscripts g and u refer to positive and negative
characters for the class of I.

TABLE 7

THE CRYSTALLOGRAPHIC POINT GROUPS

Proper Groups      Improper Groups
(G)                (G)                    (G) × I

Cyclic Groups
C_1                -                      C_i (S_2)
C_2                C_1h (C_s)             C_2h
C_3                -                      C_3i (S_6)
C_4                S_4                    C_4h
C_6                C_3h                   C_6h

Dihedral Groups
D_2 (V)            C_2v                   D_2h (V_h)
D_3                C_3v                   D_3d
D_4                C_4v; D_2d (V_d)       D_4h
D_6                C_6v; D_3h             D_6h

Cubic Groups
T                  -                      T_h
O                  T_d                    O_h


Methods of finding the characters 26 for cyclic and dihedral groups
have already been described in detail. The cubic groups O and T are the
symmetric and alternating groups on four letters; they have been discussed
as examples of permutation groups. The remaining groups, which are
indicated as a direct product, will have twice as many classes and
representations as appear in our tables. Each representation given there
will occur once with the subscript g and once with u (except for C_3h,
where the representations are A′, A″, E′, E″). For example, C_3i = C_3 × I
will have classes E, C_3, C_3^2, I, IC_3, IC_3^2. The classes already
present in C_3 keep the characters they have in C_3, once as g and once as
u, while the new classes will have the same characters as C_3 for
g-representations and the negative of those for u-representations. Groups
having the same character table are isomorphous.
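
The rule for direct products with I can be written as a Kronecker product
of character tables. What follows is a sketch under stated assumptions, not
the authors' procedure: it uses NumPy, takes ε = e^{2πi/3} for C_3, orders
the classes as E, C_3, C_3^2, I, IC_3, IC_3^2, and the array names c3, ci
and c3i are hypothetical.

```python
import numpy as np

# Sketch: character table of C_3i = C_3 x I from those of C_3 and C_i.
eps = np.exp(2j * np.pi / 3)

# Rows: A and the two conjugate members of E; columns: E, C_3, C_3^2.
c3 = np.array([[1, 1, 1],
               [1, eps, eps**2],
               [1, eps**2, eps]])

# Rows: g (symmetric to I), u (antisymmetric to I); columns: E, I.
ci = np.array([[1, 1],
               [1, -1]])

# Kronecker product: the first three rows are the g-representations, the
# last three the u-representations; the last three columns belong to the
# new classes I, IC_3, IC_3^2.
c3i = np.kron(ci, c3)

# Row orthogonality: sum over elements of chi_a * conj(chi_b) = g * delta_ab.
gram = c3i @ c3i.conj().T
print(np.round(c3i, 3))
print(np.allclose(gram, 6 * np.eye(6)))
```

The u-rows reproduce the characters of C_3 on the old classes and their
negatives on the new ones, as stated above.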

For convenience of reference, we also include the infinite group D_∞,
which is isomorphous with both R^+(2) and C_∞v, and the group D_∞h = D_∞ × I,
which is isomorphous with R^±(2).

One further question of interest here concerns the transformation
properties of a vector when subjected to the operations of a
crystallographic group. We have shown, in sec. 15.15, how a vector is
transformed by the elements of the group R^±(3). The representations from
which this effect is immediately seen are given by (43) and (49), the
characters of which are

$$\Xi_R = 1 + 2\cos\varphi; \qquad \Xi_I = -1 + 2\cos\varphi$$

The same characters must also apply to the crystallographic groups since
they are subgroups of R^±(3), but it does not follow that the characters
remain irreducible. As an example, consider the group C_4, where all of the
classes involve proper rotations. The angles for the classes of E, C_2,
C_4, C_4^3 are 0, π, π/2, 3π/2, hence Ξ_R = 3, -1, 1, 1. Comparison with
the character table for C_4 shows that these numbers are the sums of the
characters for the representations A and E. The reader should draw a figure
of the appropriate symmetry, which in this case is a square. Let the Z-axis
be perpendicular to the plane of the paper; then it is immediately obvious
that z transforms like A, for z is unchanged by the operations of the
group. When the operation C_2 is applied to the figure (i.e., rotation by
π), x is transformed into -x and y into -y, hence x ± iy becomes
-(x ± iy). Proceeding in this way with the other elements of the group, it
will be seen that x + iy transforms like the first set of characters for E
in Table 8 and x - iy like the second set. For S_4, the last two classes
are improper rotations with Ξ_I = -1, -1, hence the reducible characters
are 3, -1, -1, -1, or B + E; z transforms like B and x ± iy like E. We have
indicated all of these transformation properties in our tables. When two or
more groups are isomorphous and the representations are the same (examples,
D_4, C_4v and D_2d, or C_4 and S_4), the characters for the coordinates
refer to the first group of that table. To obtain them for the other
groups, one must change the sign of the characters for the improper
rotations; for example, z transforms like A_2 for D_4, like A_1 for C_4v,
and like B_2 for D_2d.

26 The reducible representations of all of the crystallographic groups are
given by Seitz, Z. Kristallographie, A88, 433 (1934).
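
The reduction of the vector characters for C_4 and S_4 can be reproduced
numerically. The sketch below is illustrative only. It assumes the standard
rule that the number of times an irreducible representation occurs is
(1/g) Σ χ_irr(R)* χ(R), uses the characters of C_4 with the two members of E
listed separately, and introduces the hypothetical names vector_character
and reduce.

```python
import numpy as np

def vector_character(phi, proper=True):
    """Character of a vector: +1 + 2 cos(phi) for a proper rotation,
    -1 + 2 cos(phi) for an improper one."""
    return (1.0 if proper else -1.0) + 2.0 * np.cos(phi)

# Characters of C_4 (and of the isomorphous S_4) for the classes
# E, C_2, C_4 (or S_4), C_4^3 (or S_4^3); every class has one element.
table = {"A":    [1, 1, 1, 1],
         "B":    [1, 1, -1, -1],
         "E(1)": [1, -1, -1j, 1j],
         "E(2)": [1, -1, 1j, -1j]}

def reduce(chi):
    """Multiplicity of each irreducible representation in chi."""
    # np.vdot conjugates its first argument, as the formula requires.
    return {name: int(round((np.vdot(row, chi) / 4).real))
            for name, row in table.items()}

angles = [0.0, np.pi, np.pi / 2, 3 * np.pi / 2]

# C_4: all four classes are proper rotations.
chi_C4 = [vector_character(phi) for phi in angles]
# S_4: the last two classes are improper rotations.
chi_S4 = [vector_character(phi, proper=(i < 2))
          for i, phi in enumerate(angles)]

print(np.round(chi_C4, 6), reduce(chi_C4))
print(np.round(chi_S4, 6), reduce(chi_S4))
```

The first reduction contains A and both members of E, the second B and
both members of E, in agreement with the assignments of z and x ± iy.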


TABLE 8

CYCLIC GROUPS


C_1             E
A; x, y, z      1


C_i                                          E      I
                C_2                          E      C_2
                               C_1h          E      σ_h
A_g             A; z           A′; x, y      1      1
A_u; x, y, z    B; x, y        A″; z         1     -1


C_3             E      C_3    C_3^2
A; z            1      1      1
E; x ± iy       1      ε      ε*
                1      ε*     ε
ε = e^{2πi/3}
C_3h = C_3 × σ_h;  C_3i = C_3 × I


C_4                     E      C_2    C_4    C_4^3
          S_4           E      C_2    S_4    S_4^3
A; z                    1      1      1      1
B                       1      1     -1     -1
E; x ± iy               1     -1     -i      i
                        1     -1      i     -i
C_4h = C_4 × I


C_6             E      C_6    C_3    C_2    C_3^2  C_6^5
A; z            1      1      1      1      1      1
B               1     -1      1     -1      1     -1
E_2             1     -ε*    -ε      1     -ε*    -ε
                1     -ε     -ε*     1     -ε     -ε*
E_1; x ± iy     1      ε*    -ε     -1     -ε*     ε
                1      ε     -ε*    -1     -ε      ε*
ε = e^{2πi/6}
C_6h = C_6 × I






TABLE 8 (Continued)

DIHEDRAL GROUPS


D_2 = V                                E     C_2    C_2′   C_2″
            C_2v                       E     C_2    σ_v    σ_v′
                        C_2h           E     C_2    σ_h    I
A_1         A_1; z      A_g            1     1      1      1
B_3; x      B_2; y      B_g            1    -1     -1      1
B_1; z      A_2         A_u; z         1     1     -1     -1
B_2; y      B_1; x      B_u; x ± iy    1    -1      1     -1
D_2h = D_2 × I


D_3                     E      2C_3   3C_2
          C_3v          E      2C_3   3σ_v
A_1                     1      1      1
A_2; z                  1      1     -1
E; x ± iy               2     -1      0
D_3d = D_3 × I


D_4                               E     C_2    2C_4   2C_2′  2C_2″
          C_4v                    E     C_2    2C_4   2σ_v   2σ_d
                    D_2d          E     C_2    2S_4   2C_2′  2σ_d
A_1                               1     1      1      1      1
A_2; z                            1     1      1     -1     -1
B_1                               1     1     -1      1     -1
B_2                               1     1     -1     -1      1
E; x ± iy                         2    -2      0      0      0
D_4h = D_4 × I


D_6                                      E    C_2   2C_3  2C_6  3C_2′  3C_2″
             C_6v                        E    C_2   2C_3  2C_6  3σ_d   3σ_v
                          D_3h           E    σ_h   2C_3  2S_3  3C_2′  3σ_v
A_1          A_1; z       A_1′           1    1     1     1     1      1
A_2; z       A_2          A_2′           1    1     1     1    -1     -1
B_1          B_2          A_1″           1   -1     1    -1     1     -1
B_2          B_1          A_2″; z        1   -1     1    -1    -1      1
E_2          E_2          E′; x ± iy     2    2    -1    -1     0      0
E_1; x ± iy  E_1; x ± iy  E″             2   -2    -1     1     0      0
D_6h = D_6 × I


TABLE 8 (Continued)

DIHEDRAL GROUPS (Continued)


D_∞                     E      2C(φ)       C_2′
          C_∞v          E      2C(φ)       σ_v
A_1; z                  1      1           1
A_2                     1      1          -1
E_1; x ± iy             2      2 cos φ     0
E_2                     2      2 cos 2φ    0
E_k                     2      2 cos kφ    0
D_∞h = D_∞ × I


CUBIC GROUPS


T               E      3C_2   4C_3   4C_3′
A               1      1      1      1
E               1      1      ε      ε*
                1      1      ε*     ε
T; x, y, z      3     -1      0      0
ε = e^{2πi/3}
T_h = T × I


O                       E      8C_3   3C_2   6C_2   6C_4
          T_d           E      8C_3   3C_2   6σ_d   6S_4
A_1                     1      1      1      1      1
A_2                     1      1      1     -1     -1
E                       2     -1      2      0      0
T_2                     3      0     -1      1     -1
T_1; x, y, z            3      0     -1     -1      1
O_h = O × I





15.19. Applications of Group Theory.

Since group theory is concerned with symmetry properties, its mathematical
methods should be useful in many physical problems. Its most obvious
application consists in the classification of crystals and polyatomic
molecules according to a group of the appropriate symmetry. It is natural
to inquire whether group theory may also be used in quantum mechanics. For
systems containing a number of particles, calculations by the usual methods
are difficult; hence it is fortunate that the symmetry of such problems can
be utilized to some extent in their study. 27

The Schrödinger equation for a system of n identical particles (electrons,
protons, etc.) may be written as follows:

$$H(1, 2, \ldots, n)\,\psi(1, 2, \ldots, n) = E\,\psi(1, 2, \ldots, n) \tag{15-68}$$

where the numbers 1, 2, ..., n appearing in both the Hamiltonian operator H
and the state function ψ indicate that these quantities depend on the
coordinates of particles 1, 2, ..., n. Now it is clear that, if the
coordinates of particles i and j are interchanged everywhere in eq. (68),
the latter remains valid, for such an exchange amounts to no more than a
relabelling of the particles. This fact might be indicated formally by
applying to (68) the operator P_ij, defined as effecting an interchange of
particles i and j:

$$P_{ij}H(1, 2, \ldots, n)\,P_{ij}\psi(1, 2, \ldots, n) = E\,P_{ij}\psi(1, 2, \ldots, n)$$

But the functional form of H is unaltered when P_ij is applied, regardless
of its specific form, provided the particles are identical; hence this
equation reads

$$H\,P_{ij}\psi = E\,P_{ij}\psi$$

In other words, P_ij ψ is also an eigenfunction of H, and one belonging to
the same eigenvalue E.

Now P_ij is an element of the symmetric group on n particles. Therefore
the state of affairs described above is usually expressed by saying that
the Schrödinger equation is invariant under the symmetric group.
Permutations are not the only operations under which the wave equation is
invariant. Suppose the nucleus of an atom is considered as a fixed field of
force; then rotations and reflections at this point leave the energy of the
system unchanged (i.e., the operator H is invariant with respect to them).
The groups in question are those of sec. 15.15. If the atom is in a
homogeneous electric or magnetic field, the appropriate group is the
subgroup of rotations about a fixed axis (see sec. 15.16). For a diatomic
molecule, the two nuclei are the centers of force (as a first
approximation) and the groups are those of rotation about, and reflection
in a plane through, the line joining the nuclei. If the nuclei are
identical (as in hydrogen, oxygen or nitrogen) reflections in a plane
perpendicular to the internuclear line (i.e., exchange of the nuclei) must
also be included. For a polyatomic molecule, the potential energy has the
same symmetry as the molecule itself, hence the wave equation is invariant
to some one of the crystallographic groups. These examples should be
sufficient to indicate the kinds of groups which are of importance in
quantum mechanical problems. Each case must be studied individually and all
groups under which the particular Schrödinger equation to be studied is
invariant must be taken together to form the complete group of the
Schrödinger equation.

27 Such usage has been discussed in detail, especially, by Wigner and Weyl
in references cited at the end of this chapter. Many of the books listed in
Chapter 11 on quantum mechanics, particularly that of Dirac, avoid the
formalism of group theory but obtain equivalent results by a more physical
procedure.

As a simple example of the method, consider the one-dimensional wave
equation 28

$$\left\{-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right\}\psi(x) = W\psi(x) \tag{15-69}$$

where the potential energy is of such a form that

$$V(x) = V(-x)$$

and where the energy state W is non-degenerate, as is nearly always true in
such one-dimensional problems. Suppose P_I is an operation which replaces x
by -x wherever it occurs in (69). Then

$$P_I\psi(x) = \psi'(x) = \psi(-x)$$

the result being a new ψ-function, ψ′, which has the same value at x as the
old one, ψ, had at -x. The new ψ-function will, however, satisfy the wave
equation as well as the old one, with the same value of W. Hence it must
be a constant multiple of ψ(x), i.e., ψ′ = cψ. But if both ψ and ψ′ are to
be normalized, c can only be +1 or -1 (applying P_I twice restores the
original function, so that c^2 = 1). This result recalls the well-known
fact that all eigenfunctions of eq. (69) are either even or odd functions
of x.

To exhibit the connection with group theory we note here the following
facts which will be illuminated subsequently. Let P_E be an operator that
replaces x by itself. Then

$$P_E\psi(x) = \psi(x)$$

Combining P_E with P_I we obtain a group,

$$P_I P_E = P_E P_I = P_I; \qquad P_I^2 = P_E$$

which is isomorphous with C_i (sec. 15.18), and others mentioned in
preceding sections. It has two irreducible representations, both of degree
one (see Table 8). The representation A_g corresponds to even
eigenfunctions and A_u to odd ones.
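
The even or odd character of the eigenfunctions can be seen in a small
numerical experiment. The sketch below is not from the original text and
rests on several assumptions: the harmonic potential V(x) = x^2/2 is chosen
only because it is symmetric, units are taken with ħ = m = 1, the second
derivative is replaced by a three-point finite difference on a symmetric
grid, and NumPy is used for the diagonalization.

```python
import numpy as np

# Sketch: eigenfunctions of a symmetric one-dimensional potential.
# Assumptions: hbar = m = 1, V(x) = x^2 / 2, finite-difference grid.
n = 201
x = np.linspace(-5.0, 5.0, n)        # grid symmetric about x = 0
h = x[1] - x[0]
V = 0.5 * x**2                        # satisfies V(x) = V(-x)

# H = -(1/2) d^2/dx^2 + V, with the three-point second difference.
off = np.ones(n - 1)
H = (np.diag(1.0 / h**2 + V)
     - 0.5 / h**2 * (np.diag(off, 1) + np.diag(off, -1)))

W, psi = np.linalg.eigh(H)            # columns of psi are normalized

# P_I reverses the grid; for a non-degenerate level psi' = c psi, and the
# overlap of psi with its mirror image gives c = +1 or -1.
parities = [round(float(psi[:, k] @ psi[::-1, k])) for k in range(4)]

print(np.round(W[:4], 4))
print(parities)
```

Levels with c = +1 belong to A_g and those with c = -1 to A_u.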

28 We now use W for the eigenvalue in this section, reserving E for the
unit element of a group.

Next, let us suppose that the Hamiltonian operator is invariant to a
group of linear substitutions, such as R, S, etc., and that the ψ-function
depends on n coordinates x_1 ... x_n. These may be combined to form a
vector x. If, then,

$$\mathbf{x}' = R\mathbf{x}$$

we may define an operator P_R which changes ψ(x) into ψ(x′):

$$P_R\psi(\mathbf{x}) = \psi(\mathbf{x}')$$

Now consider two cases:

a. ψ(x) is non-degenerate. Since, from the invariance of H, P_R ψ
must also satisfy the Schrödinger equation with the same W, it must be
identical with ψ (except for a constant multiplier).

b. ψ(x) belongs to an eigenvalue W which possesses an a-fold degeneracy.
We may then label the a linearly independent functions

$$\psi_1, \psi_2, \ldots, \psi_a$$

The effect of P_R on ψ_κ will then be to convert it into a linear
combination of the ψ_i, for such a combination is the most general function
belonging to W. Hence

$$P_R\psi_\kappa = \sum_{k=1}^{a} \psi_k D(R)_{k\kappa}$$

D(R) being a certain matrix associated with the operator P_R. Similarly,

$$P_S\psi_k = \sum_{j=1}^{a} \psi_j D(S)_{jk}$$

and

$$P_S P_R\psi_\kappa = \sum_{j,k} \psi_j D(S)_{jk} D(R)_{k\kappa} = \sum_{j} \psi_j [D(S)D(R)]_{j\kappa} \tag{15-70}$$

From sec. 15.7, it should be clear that the matrices whose elements appear
on the right of eq. (70) are representations of the group of the operators
P_R, P_S. The dimension of each representation equals the number of
linearly independent ψ-functions, hence it is also equal to the degeneracy
of the eigenvalue. If the original set of ψ-functions were not linearly
independent the resulting representations would be reducible. When the
complete set of irreducible representations is obtained we see that each
one would correspond to an eigenvalue of the quantum mechanical problem.
The value of group theory in quantum mechanics is thus evident. From the
symmetry of the system and without solving the wave equation we may obtain
the possible number of eigenvalues and the degeneracy of each. Moreover,
each eigenvalue may be classified according to the particular
representation to which it belongs. This fact is of considerable interest
to the spectroscopist in studying the possible number and the symmetry of
the energy levels to be expected in a given case. For example, as indicated
in an earlier paragraph of this section, the group for the diatomic
molecule is R^±(2) of sec. 15.16. This has an infinite number of
representations m = 1, 2, 3, ... and two representations for m = 0. These
correspond to the electronic energy levels 29 Π, Δ, Φ, ... for
m = 1, 2, 3, ... and Σ^± for m = 0.
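
The link between degeneracy and the dimension of a representation can be
illustrated on a finite-dimensional model. The sketch below is a toy
example of our own and not a calculation from the original text: it assumes
three identical sites at the corners of an equilateral triangle, described
by a 3 × 3 matrix H with equal diagonal elements a and equal couplings b
(the numerical values are arbitrary), whose symmetry group of site
permutations is isomorphous with D_3. The names perm_matrix, irreps and
mult are hypothetical.

```python
import numpy as np
from collections import Counter
from itertools import permutations

# Sketch: a model "Hamiltonian" invariant under all permutations of
# three identical sites. The values of a and b are illustrative only.
a, b = 0.0, -1.0
H = np.array([[a, b, b],
              [b, a, b],
              [b, b, a]])

def perm_matrix(p):
    """Permutation matrix with a one in row i, column p[i]."""
    M = np.zeros((3, 3))
    M[np.arange(3), list(p)] = 1.0
    return M

group = [perm_matrix(p) for p in permutations(range(3))]

# Invariance: every group element commutes with H.
assert all(np.allclose(P @ H, H @ P) for P in group)

# Characters of D_3 (A_1, A_2, E). The trace of a permutation matrix,
# i.e. the number of sites left fixed, identifies the class:
# 3 -> E, 0 -> C_3, 1 -> two-fold element.
irreps = {"A1": {3: 1, 0: 1, 1: 1},
          "A2": {3: 1, 0: 1, 1: -1},
          "E":  {3: 2, 0: -1, 1: 0}}

# Reduce the three-dimensional permutation representation.
traces = [int(round(np.trace(P))) for P in group]
mult = {name: sum(chi[t] * t for t in traces) // len(group)
        for name, chi in irreps.items()}

# Degeneracies found by actually diagonalizing H.
levels = Counter(np.round(np.linalg.eigvalsh(H), 8))
degeneracies = sorted(levels.values())

print(mult, degeneracies)
```

The reduction contains A_1 once and E once, so symmetry alone predicts one
single and one doubly degenerate level, which the diagonalization confirms.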

The selection rules for allowed transitions in atomic and molecular
spectroscopy may be determined readily from the symmetry alone. As shown
in sec. 11.28 these depend on the matrix elements of the electric moment.
The latter is itself a vector and its components will transform under the
operations of the group like x, y, z or some combination of these
components as shown in Table 8 for the various symmetry groups. The
ψ-function of a given state will also belong to some irreducible
representation of the group. The product of a component of the electric
moment and the ψ-function will transform like the direct product of the
representations for each. This direct product will often be reducible and
when reduction is effected, the result will be a sum of representations of
the symmetry group. Transitions are allowed only to these states. Actually
it is not necessary to know the representations themselves, as a knowledge
of the characters alone is sufficient. The reader should refer to other
sources 30 for the details of the theory. A simple example will show how
the method is used.

Suppose a given energy level is known to have a ψ-function which
transforms like E_2 in the group D_6. Then for an electric moment along z,
the direct product of the characters of A_2 and E_2 is 2, 2, -1, -1, 0, 0,
hence the only allowed transition from E_2 is to another state of the same
symmetry. If the component of electric moment (x ± iy) is of interest, the
characters are those of E_1 times E_2 or 4, -4, 1, -1, 0, 0, which is a sum
of characters for B_1, B_2 and E_1. Transitions are allowed from E_2 to
either B_1, B_2 or E_1 but to no others for the (x ± iy) component of
electric moment.
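
The arithmetic of this example is easily checked. The following sketch, an
illustration rather than part of the original text, assumes the characters
of D_6 for the classes E, C_2, 2C_3, 2C_6, 3C_2′, 3C_2″ and the usual
reduction formula (1/g) Σ N_c χ_irr(c) χ(c), where N_c is the number of
elements in class c; the function name reduce is hypothetical.

```python
import numpy as np

# Sketch: reduction of direct products in D_6.
# Classes: E, C_2, 2C_3, 2C_6, 3C_2', 3C_2''.
sizes = np.array([1, 1, 2, 2, 3, 3])

table = {"A1": np.array([1, 1, 1, 1, 1, 1]),
         "A2": np.array([1, 1, 1, 1, -1, -1]),
         "B1": np.array([1, -1, 1, -1, 1, -1]),
         "B2": np.array([1, -1, 1, -1, -1, 1]),
         "E1": np.array([2, -2, -1, 1, 0, 0]),
         "E2": np.array([2, 2, -1, -1, 0, 0])}

def reduce(chi):
    """Irreducible representations contained in chi, with multiplicity.
    All characters of D_6 are real, so no complex conjugate is needed."""
    order = sizes.sum()
    found = {}
    for name, row in table.items():
        m = int(round(np.sum(sizes * row * chi) / order))
        if m:
            found[name] = m
    return found

# Electric moment along z transforms like A_2; along x ± iy like E_1.
along_z = table["A2"] * table["E2"]
along_xy = table["E1"] * table["E2"]

print(along_z, reduce(along_z))
print(along_xy, reduce(along_xy))
```

The characters of a direct product are the class-by-class products of the
characters of its factors, which is all the sketch uses.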

Selection rules for the Raman effect depend in a similar way on the
transformation properties of the polarizability tensor. Its characters are
2 ± 2 cos φ + 2 cos 2φ, the upper sign referring to a proper rotation and
the lower sign to an improper one.

As shown in sec. 9.10, the instantaneous position in space of a
polyatomic molecule containing n atoms is specified by 3n coordinates.
Three of them locate the center of gravity of the molecule and are thus
associated with translational motion. Three more (or two, if the molecule
is a linear one) describe orientation relative to principal axes of inertia
and the motion is rotation. The remaining (3n - 6) or (3n - 5) coordinates
are descriptive of internal motions or vibrations. Now the latter, as well
as the three translations, transform like vectors and the activity of the
vibration in the infrared or the Raman effect may be determined as we have
indicated. The transformation properties of rotation are like those of
angular momentum and from eq. (9-19) it may be shown that the reducible
characters are 1 ± 2 cos φ, the upper sign again referring to proper
rotations and the lower sign to improper ones. Use of these transformation
properties makes it possible to predict in considerable detail the
spectroscopic behavior of the polyatomic molecule, provided its symmetry is
known or a reasonable one assumed.

29 These are the customary symbols in molecular spectroscopy; see, for
example, Herzberg, G., “Molecular Spectra and Molecular Structure; Diatomic
Molecules,” D. Van Nostrand Co., Inc., New York, 1950.

30 See, for example, Eyring, Walter, and Kimball, loc. cit. or Meister,
A. G., Cleveland, F. F., and Murray, M. J., Am. J. Phys. 11, 239 (1943).

REFERENCES

General treatments of group theory:

Burnside, W., “The Theory of Groups,” Cambridge University Press, 1927.

Kowalewski, G., “Einführung in die Theorie der Kontinuierlichen Gruppen,”
Chelsea Publishing Co., New York.

Kurosh, A., “Group Theory,” Second Edition, Chelsea Publishing Co., New
York, 1954.

Ledermann, W., “Introduction to the Theory of Finite Groups,” Second
Edition, Interscience Publishers, Inc., New York, 1953.

Littlewood, D. E., “The Theory of Group Characters,” Oxford University
Press, New York, 1940.

Murnaghan, F. D., “The Theory of Group Representations,” The Johns Hopkins
Press, Baltimore, 1938.

Weyl, H., “The Classical Groups,” Princeton University Press, 1939.

Zassenhaus, H., “The Theory of Groups,” Chelsea Publishing Co., New York,
1949.

Treatises on quantum mechanics, which use the methods of group theory:

Bauer, H., “Introduction à la Théorie des Groupes,” Les Presses
Universitaires de France, Paris, 1933.

Casimir, H. B. G., “Rotation of a Rigid Body in Quantum Mechanics,” J. B.
Wolters, Groningen, 1941.

Corson, E. M., “Perturbation Methods in the Quantum Mechanics of n-Electron
Systems,” Second Edition, Hafner Publishing Co., New York, 1953.

Corson, E. M., “Introduction to Tensors, Spinors, and Relativistic
Wave-Equations,” Hafner Publishing Co., New York, 1953.

van der Waerden, B. L., “Die Gruppentheoretische Methode in der
Quantenmechanik,” J. Springer, Berlin, 1932; Edwards Brothers, Inc., Ann
Arbor.

Weyl, H., “Theory of Groups and Quantum Mechanics,” Methuen and Co., Ltd.,
London, 1931.

Wigner, E. P., “Gruppentheorie und Ihre Anwendung auf die Quantenmechanik
der Atomspektren,” Vieweg, Braunschweig, 1931; Edwards Brothers, Inc., Ann
Arbor.

Application of group theory to crystal structure:

Burckhardt, J. J., “Die Bewegungsgruppen der Kristallographie,” Birkhäuser,
Basel, 1947.

Phillips, F. C., “Crystallography,” Longmans, Green and Co., Inc., New
York, 1947.

Schoenflies, A., “Theorie der Kristallstruktur,” Gebrüder Borntraeger,
Berlin, 1932.

Zachariasen, W. H., “Theory of X-ray Diffraction in Crystals,” John Wiley
and Sons, Inc., New York, 1945.



INDEX 


Abelian group, 546 
Abel’s integral equation, 541 
Absolute temperature, 30 
velocity, 289 

Accessory conditions, 209 
Adams, E. P., 177 
Adams, N. Jr., 140 
Addition of matrices, 306 
of vectors, 140 

theorem for Legendre polynomials, 
109 

Adiabatic expansion, 36 
Adjacent path, 198 
Adjoint matrix, 309 
Aggregate, probability, 435 
Aitken , 316, 322, 332 
Algebraic calculations, 491 
Allen, 519 

Alternating group, 558, 561 
Amplitudes, probability, 347 
Analogies between thermodynamic and 
statistical quantities, 459 
Analogues, statistical, of thermodynamic quantities, 452
Analysis, indirect chemical, 314 
Anchor rings, 190 
Angles, Eulerian, 282, 286 
Angular momentum, 285 
eigenfunctions of, 360 
eigenvalues of, 360 
in quantum mechanics, 338 
internal, 293 
velocity, 145, 285 

Antisymmetric eigenfunctions, 416, 455 
Approximate quadrature, 474 
Approximation in the mean of func- 
tions, 276, 279 

Approximation method, for algebraic 
equations, 493 

for differential equations, 484 
for secular determinants, 503 
Arbitrary constants, 33 
Areas, vector, 144 


Arley, 519 

Arrangements, 431, 432 
Arrays of numbers, 301 
Assemblage of identical particles, 417 
Assignment of statistical weights, 456

Associate matrix, 309 
Associated Laguerre function, 80 
Laguerre polynomial, 78, 106, 132 
Legendre functions, 68 

differential equation for, 68 
representation, 560 
spherical harmonics, 68, 69 

differential equation for, 68, 69, 
223 

tensor, 167 
Associative law, 545 
Atmospheric pressure, 36 
Atom in a magnetic field, 408 
Auxiliary equation, 49 
Average error, 510 
weighted, 514 
Axes, coordinate, 172 
Axial vector, 165 

Axiomatic foundation of quantum 
mechanics, 335 
Axis of rotation, 285 
of symmetry, n-fold, 575 
Azimuth, 177 

Bacteria, 33 
Baggott , 483 
Balls in boxes, 438 
Barrier problems, 353 
Base vectors, 193 

Bateman , 245

Bauer , 566, 586 
Beattie , 477 
Beers , 519 
Bent , 17 

Bernoulli’s equation, 44 
numbers, 474 
Bessel coefficients, 113 
functions, 75, 113, 132
formulas involving, 121 
Bessel's differential equation, 74, 221, 
235, 540 
integral, 116 

interpolation formula, 470 
B-function, 97 
Bilinear form, 317 
Binomial coefficients, 433 
addition theorem of, 434
theorems on, 434, 435 
Binomial expansion, 433 
theorem, 113 
Biot , 252

Bipolar coordinates, 187 
Birge , 492, 504, 518, 519 
Bliss , 215 

Bloch’s theorem, 81 
Bocher , 72, 313, 323, 543 
Body, deformable, 169 
elastic, 169 
rigid, 282 
Bohm, 429

Bohr’s frequency condition, 402 
radius, 131 

theory, angular momentum, 340 
Bolza, 215 

Born, 27, 87, 328, 429 
Bose-Einstein statistics, 462 
Boundary conditions of differential 
equations, 269, 520, 537 
Brachistochrone, 202 
Bridgman , 15, 17 
Brillouin , 171, 429 
Browne, 11 
Brunt , 518 
Buck, 519 
Buchdahl , 27 
Burckhardt , 586 
Burington , 59, 156 
Burnside, 586 
Byerly , 215, 249 

Cable, hanging under own weight, 58 
Calculation, algebraic, 491 
Cambi, 136 
Campbell, 252 

Canonical ensemble, 446, 448 
equations, Hamilton’s, 284, 443 
Canonically conjugate 
variables, 284 


Capacitance, 44 
Caratheodory, 13, 27 
principle of, 26 
Carroll, 17 
Carslaw, 266, 267 
Cartesian components, 101 
Cartesian, coordinates, 172, 177, 338 
expansion for divergence, 150 
for scalar product, 142 
for scalar triple product, 147 
for vector product, 143 
system, 553, 576 
Casimir, 586 
Catenary, 58 
Cauchy relations, 41, 89 
Cauchy’s integral theorem, 259 
Cauchy-Riemann, 90 
Cayley-Klein parameters, 287 
Central difference formulas, 467 
Central field motion, quantum treat- 
ment, 363 
Centrifuge, 38 
Chain, sliding over peg, 51 
Chapman, 431, 466 
Characteristic equation, 7

of a matrix, 318 
functions, 246
roots of a matrix, 319 
values, 246 

Characters of a representation, 554, 578 
tables of, 579 

Charged cylinder, potential near, 225 
Chasles’ Theorem, 275 
Chemical analysis, indirect, 314 
Chemical potentials, 25 
Christoffel three-index symbol, 167, 196

Churchill , 136, 267 

Circular membrane, vibrations of, 254 
Clairaut’s equation, 47, 48 
Clapeyron’s equation, 38 
Classes, 432, 547 

Classical physics contrasted with quan- 
tum mechanics, 333 
Classification of eigenvalues according 
to irreducible representations, 
584 

Clausius, 11 , 24, 38 
Cleveland, 585 
Closed systems, 394 
Coefficient of diffusion, 238 





Coefficients, detached, 498 
Cofactor of determinant, 303 
Cogradient variable, 318 
Colatitude, 177 

Collar , 282, 332, 498, 499, 502, 503 
Collatz, 519 

Collineatory transformation, 317 
Combinations, 431 
of three vectors, 146 
with repetitions, 433 
Commutability of operators, 337, 338 
Compatible measurements, 338 
Complementary function, 53 
Complete differential, 82
solution, 32 

Completeness of a set of functions, 248, 
249 

eigenfunctions, 344 
Complex integration, 89 
of a group, 548 
Components of a tensor, 162 
Components, of thermodynamic sys- 
tem, 26 

of a curvilinear vector, 174 
of a vector, 137 

Composite functions of del, 153 
Conditions on state functions, 336 
Condon, 358, 422, 429 
Conducting sphere, in field of point 
charge, 226 
potential near, 224 
Conductivity, thermal, 35 
Configuration space, 336 
Confocal ellipsoidal coordinates, 178 
paraboloidal coordinates, 184 
Congruent transformation, 317, 322 
Conical coordinates, 183 
Conjugate variables, 284 
elements of a group, 547 
subgroup, 548 

Conjunctive transformation, 317 
Conservation of density-in-phase, 444 
Conservative system, 205, 281 
Constraints, 283 

Continuity, equation of, 152, 217, 
399 

Continuous group, 562 
Continuous spectrum, 402 
of eigenvalues, 336, 340 
Contraction of tensors, 166 
Contragradient variable, 317 


Contravariant tensor, 163
vector, 162 
Convolution, 261 
Coolidge, 424, 503
Coordinate axes, 172 
line, 172, 192 
surface, 172 

Coordinate systems, orthogonal, 173 
non-orthogonal, 192 
n-dimensional, 312 
Coordinates, bipolar, 187 
Cartesian, 172, 177 
confocal ellipsoidal, 178 
confocal paraboloidal, 184 
conical, 183 
curvilinear, 172 
cylindrical, 178, 191 
ellipsoidal, 179 
elliptic cylindrical, 182 
generalized, 443 
normal, 292, 326 
oblate spheroidal, 182 
of Lagrange, generalized, 283 
Coordinates, parabolic, 185 
parabolic cylindrical, 186 
prolate spheroidal, 180
relative, 411 

spherical polar, 177, 191 
tensors in curvilinear, 192 
toroidal, 190 
Corbin, 282
Corson, 430, 586
Coset, 548 

Cosines, direction, 138, 173 
Cotes, formula of Newton and, 476 
Coulomb field, motion in, 365 
Coulson, 245, 430

Courant, 267, 270, 279, 281, 379, 528, 
543 

Covariant derivative of tensor, 169, 196
Covariant tensor, 163
vector, 163 
Cowling, 431, 466 
Craig, 171
Cramer’s rule, 313
Crawford, 503
Cross, 300, 503
Cross product, 143
Crumpler, 504

Crystallographic point groups, 574 
Cubic groups, 575 







Curl in curvilinear coordinates, 175 
in tensor notation, 197 
of a vector, 152 

Current in quantum mechanics, 399 
Curve fitting, errors in, 518 
Curvilinear component of vector, 174 
Curvilinear coordinates, 172 
tensors in, 192 
Cycle, 

of permutation, 558 
thermodynamic, 8 
Cyclic group, 543, 557 
Cycloid, 203 

Cylinder, potential of charged, 225 
Cylindrical coordinates, 178, 191 
elliptic, 182 
parabolic, 186 

Damped harmonic motion, 51 
Darling, 299
Darwin, 452, 459
Darwin-Fowler method, 452
Davey, 576
De Broglie, 429
formula, 398 
wave length, 352 
Decius, 300
DeFay, 31 
Definite form, 317 
Deformable body, 169 
Degeneracy, 296 
factor, 458 

due to particle exchange, 415 
Degenerate eigenvalues, 273 
Degree, of a cycle, 454 
of a differential equation, 32 
of a group representation, 550 
Degrees of freedom, 26, 282 
Del, 150 

composite functions of, 153 
successive applications of, 153 
Delta, Kronecker, 163, 308 
Deming, 519

De Moivre’s theorem, 557 
Density, flux, 152 
Density-in-phase, 444 
Derivative, covariant, 196
directional, 150, 174 
of tensor, 167 
of tensor, covariant, 169 
partial, 2 


Derivative, thermodynamic, 15 
Deviation, standard, 511 
Descents, method of steepest, 459 
Desch, 26

Determinant, cofactor of, 303
complementary minor, 303 
definition of, 302 
differentiation of, 305 
expansion of secular, 500 
Gram, 135 

Laplace development of, 311 
multiplication of, 303 
numerical evaluation of, 499 
numerical solution of secular, 500 
properties of, 311 
roots of a secular, 500 
solution of simultaneous equations 
by, 497
value of, 302 
Wronskian, 135 
Diagonal matrix, 298 
Diagonalization of a matrix, 319, 331 
Diatomic molecule, 360, 582 
Difference, definition of, 468 
divided, 470 
formulas, central, 470 
of tensors, 164 
of vectors, 141 
table, 468 

Differential, complete, 82 
Differential, exact, 1, 8, 82, 156 
higher order, 5 
incomplete, 82 
inexact, 82 

Differential and integral equations, 
relation of, 532 

Differential equation, of Thomas- 
Fermi, 491 

numerical solution of, 482 
partial, 216 

Differential operator, 267 
in tensor notation, 195 
Differentiation, 261 

Differentiation, by polynomial method, 
473 

numerical, 472 
of determinants, 305 
of tensor, 167 
of vectors, 148 
partial, 1 

with interpolation formula, 472 





Diffraction of waves, 234 
Diffusion, differential equation of, 237 
quantum mechanical, 397 
Dimension of a group representation, 
550 

Dingle, 335

Dipole, potential due to, 228 
Dipole moment, 56, 101 
waves, 236 

Dirac, 254, 341, 429, 582 
Dirac δ-function, 239, 341
Direct product of groups, 556 
sum of representations, 551 
Direction cosines, 138, 173 
Directional derivative, 150, 174 
Dirichlet integral, 253 
Discontinuous potentials, 354 
Discrete group, 562 

Discriminants of a quadratic form, 323 
Dispersion, of a function, 436, 437 
of a statistical aggregate, 349 
Displacement, electric, 152 
operators for spins, 407 
Distribution law, Gauss, 504 
quantum mechanical, 453 
Distribution of probability, 436 
Divergence, 151 
in tensor notation, 196 
theorem of, 159 
Divided differences, 470 
Divisor, normal, 548 
Doetsch, 266 
Dot product, 141 
Double volume integrals, 382 
two-center problem, 424 
Dummy index, 162 
Duncan, 282, 332, 498, 499, 503
Dushman, 429
Dwyer, 519
Dyadic, 163 
Dynamics, 171 

Eckart, 291, 358, 413, 565, 566
Eddington, 171
Eigendifferentials, 342 
Eigenfunction, 246 
completeness of, 276 
of integral equations, 527 
Eigenstates, simultaneous, 348 
Eigenvalue, 246 
degenerate, 273 


Eigenvalue, of a kernel, 527 
of a matrix, 319 
Eigenvectors, 319, 405 
Einstein-Bose, statistics, 456 
Eisenhart, 176 
Elastic body, 169 
Electric displacement, 152 
flux, 151 
polarization, 161 
Electricity, 171, 190 
Electrodynamics, 180 
Electromotive force, 42 
Electron, 418 
spin of, 402 

Element, conjugate, 547 
in probability theory, 435 
line, 173, 192 
of a group, 545 
surface, 173, 192 
volume, 173 
Ellipsoid, ovary, 181 
planetary, 182 
Ellipsoidal coordinates, 178 
Elliptic, cylindrical coordinates, 182 
function, 59, 180 
integral, 180 
Emde, 98, 120, 254
Emission, radioactive, 439 
Empirical formula, 516 
error in, 518 

Energy, internal, 11, 451 
kinetic, 283 
potential, 176 
shell ensemble, 446 
Ensemble, 444 
canonical, 446 
Gibbsian, 442 
microcanonical, 446 
Enthalpy, 13 
Entropy, 12, 451, 464 
Envelope, 48 
Epstein, 1, 186
Equation, homogeneous, 313 
inhomogeneous, 313 
integral, 520 
linear, 313

of a matrix, characteristic, 319 
of continuity, 152 
of state, 7 

of Sturm-Liouville, 534 
solution of integral, 521 





Equations, numerical solution of, dif- 
ferential, 482, 489, 490 
simultaneous, 493, 497 
transcendental, 491 
Equations of motion, Hamilton’s, 
284 

Lagrange's, 283 
Newton’s, 282 

Equilibrium, heterogeneous, 26 
thermodynamic, 14 

Equivalence of linear operators and 
matrices, 374 
Equivalent matrices, 316 
Ergodic hypothesis, 443 
Error, average, 510 
determinant, 504 
function, 505 
in empirical formulas, 518 
of a function, probable, 515
probability of, 505 
probable, 510 
random, 504 
root mean square, 510 
Essential observability, criterion of, 
334 

Euler, angles, 282, 286, 368, 566 
definition of gamma function, 94 
equation, 200, 270 
integral, 97 

Maclaurin formula, 474 
Mascheroni constant, 96, 427 
method for differential equations, 
485 

Euler-Rodrigues parameters, 287 
Even and odd functions, 103, 583 
Exact differential, 1, 8, 27, 156 
Exact differential, equation, 41 
Exchange, degeneracy, 415 
integral, 387, 428 
forces, 387 

Exclusion principle, 411, 415 
Expansion, adiabatic, 36 
coefficients, 374

Expected mean, in Gibbs phase space, 
445 

of a sequence of observation, 342 
Explicit function, 2 
Exponential integral, 427 
Extensive variable, 1 
Extremal, 200 
Eyring, 430, 576, 585 


Factor group, 549 
Faltung, 261, 262
Feller, 245 
Fermi, 491 

Fermi-Dirac statistics, 455 ff., 462
Feshbach, 171, 176, 215, 226, 281, 342, 
521, 526, 543 
Field, scalar, 149 
vector, 149 
Field strength, 161 
Figures, significant, 467 
Findlay, 26
Finkelnburg, 430
First order, perturbation, 389 
simultaneous differential equations, 
numerical solution of, 489 
Floquet’s theorem, 80 
Flow of fluid, 151 
heat, 34, 160 
particles, 399 
Flügge, 430
Fluid, flow, 151 
incompressible, 224 
Flux, density, 152 
electrical, 151 
thermal, 151 

Forbidden transition, 402 
Force, 142 

acting on particle, 282 
moment of, 145 
Form, bilinear, 317 

discriminants of a quadratic, 323 
Hermitian, 329 
positive definite, 317
quadratic, 317 
semi-definite, 317 

Formula, Bessel’s interpolation, 470 
central difference, 470 
empirical, 516 
interpolation, 468, 469 
Lagrange’s interpolation, 470 
Stirling’s interpolation, 470 
Forsythe, 67, 75, 136, 215 
Foster, 252 
Fourier, 525 
Fourier analysis, 247 
of odd and even functions, 253 
Fourier-Bessel, expansion, 257 
integral, 256 
transformation, 235, 240 
transforms, 254, 256 





Fourier integral, 252 
series, 111 
theorem, 253 

Fourier transformation, 239, 259, 263 
Fourier transforms, 252 
Fowler, 452, 459, 465 
Frank, 244, 267 

Frazer, 282, 332, 498, 499, 502, 503 
Fredholm’s integral equation, 520 
method of solution, 526 
Free energy, Gibbs, 14 
Helmholtz, 13, 451, 464 
Free particle, 282, 396 
Freedom, degrees of, 282 
Frenkel, 429 

Frequency, relative, 435 
Frequency condition, Bohr’s, 402 
Fuchs’ theorem, 69 
Full linear group (FLG), 562 
Function, elliptic, 180

even, 103, 583 
implicit, 6 
odd, 583 
potential, 283 
probable error of, 515 
scalar point, 149, 174 
Functional determinants, 17 

Gamma function, 75, 93
space, 443 
Gas space, 443 

Gauss, differential equation, 72 
error function, 397, 437, 439, 504 
method, for numerical integration, 
479 

theorem, 159 
General solution, 32 
Generalized coordinates, 283, 443 
momentum, 283 
Generating functions, 133 
Geodesics, 200 

Gibbs, 1, 13, 24, 141, 163, 193, 442, 443
ensembles, 442 
Gibbs-Wilson, 171 
Glasstone, 31, 430, 465, 478, 492 
Goldstein, 165, 282, 287 
Gombas, 491
Goranson, 17 
Gordy, 300 
Goudsmit, 403

Gradient, 150 


Gradient, in tensor notation, 195, 196 
Graeffe’s method, for roots of a poly- 
nomial, 494 

solution of secular determinants, 500 
Gram determinant, 134, 312 
Gray, 120, 136 
Green’s formula, 536 
Green’s function, 534 
examples of, 539 
modified, 537 

Green’s theorems, 161, 242 
Gregory’s formula, 476 
Group, Abelian, 546 
alternating, 558, 561 
continuous, 562 
crystallographic, 574 
cubic, 575 
cyclic, 546, 557 
definition of, 545 
dihedral, 572, 573 
discrete, 562 
factor, 559 
full linear, 562 
full, real orthogonal, 569 
mixed continuous, 562 
octahedral, 575 
point, 574 
quotient, 559 
rotary reflection, 570 
rotation, 565 

special unitary (SUG), 563, 567 
symmetric, 558, 561 
tetrahedral, 575 
unimodular unitary, 563 
unitary, 562 
velocity, 398 

Group character, tables of, 579 
Group theory, applications of, 581 
Growth, organic, 33 
Guggenheim, 31

Hamel, 543 

Hamilton’s canonical equations, 284, 
443 

principle, 204 
Hamiltonian function, 284 
operator, 213, 341 

Hamiltonian function, operator, time 
dependent, 396 
quantum mechanical, 299 
Hankel function, 76, 118, 525 





Harmonic function, 220 
motion, 50 
Heat capacity, 11, 12 
conduction, differential equation of, 
237 

content, 13 
flow, 34, 160 
linear, 238 

two and three dimensions, 240 
Heine’s formula, 109, 110 
Heisenberg, 420, 429
matrix theory, 371 
uncertainty principle, 348 
Heitler-London method, 424 
Helium atom, normal state of, 380 
excited states, 418 
ionized, 365 
Hellinger, 543
Helmholtz’ equation, 39, 42 
free energy, 451 
function, 69 

Hermite differential equation, 76, 121 
functions, 121, 124, 359 
polynomial, 76, 121 
Hermitian form, 329 
conjugate, 310 
matrix, 310, 329 
operator, 269, 344, 374 
scalar product, 329 
vector space, 328 

Herzberg, 296, 297, 299, 300, 496, 576,
585

Heterogeneous equilibrium, 26 
Hicks, 500

High eigenvalues, distribution of, 274 
Hilbert, 267, 270, 277, 279, 281, 528, 543 
Hilbert-Schmidt method for integral 
equations, 528 
Hobson, 136, 172, 191, 480
Homogeneous, meaning of term, 45 
Homogeneous equations, 313 
gas reactions, 36 
integral equation, 527 
polynomial, 317 
Homo-polar binding, 424 
Horn, 543

Horner’s method for roots of a poly- 
nomial, 494 
Householder, 519 
Houston, 429
Howard, 299 


Hughes, 413

Hydrodynamics, 171, 180, 190 
Hydrogen atom, 129, 131 

quantum mechanical treatment, 363 
Hydrogen molecular ion, 385 
Hydrogen molecule, 424 
Hyperbolic functions, 187 
Hypergeometric equation, 72, 370 
series, 72 

Ideal gas, 24 
ensemble for, 442 
Image, electrical, 228 
Implicit function, 6 
Improper rotation, 325, 569 
Ince, 88 

Independent systems, quantum me- 
chanics of, 414 
Index, dummy, 162 
of subgroup, 548 
precision, 441, 505 
umbral, 162 
Indicial equation, 61 
Indistinguishable objects, arrange- 
ments of, 432 
Inductance, 42 
Inertia, moment of, 286 
product of, 286 
Infinite potential hole, 352 
Inhomogeneous differential equation, 
242 

equations, 313 
integral equation, 526 
Inner product of vectors, 141 
tensors, 166 
Integral, elliptic, 180 
line, 155 
surface, 156 
volume, 156 

Integral and differential equations, re- 
lation of, 532 

Integral equation, Abel’s, 541 
definition of, 520 
eigenfunctions of, 527 
Fredholm’s, 520 
homogeneous, 527 
inhomogeneous, 526 
kernel of, 520 
linear, 520 
non-linear, 520 
of the third kind, 520 





Integral equation, resolvent of, 522 
Schmidt-Hilbert method for, 528 
solution of, 521 

Integral equation, summary of methods 
of solution, 532 
use of, 532 
Volterra’s, 520 

Integrating denominator, 28, 29, 84 
factor, 41 
Integration, 261 
numerical, 473 
vector, 154 
Intensive variable, 1 
Internal energy, 11, 451
Interpolation, inverse, 471 
two-way, 471 

Interpolation for equal values of the 
argument, 467 

unequal values of the argument, 470 
Interpolation formula, 468 
Bessel's, 470 
differentiation, 472 
Lagrange’s, 470 
Newton’s, 469 
Stirling’s, 470 

Invariance of wave equation, 582 
Invariant, 163 
subgroup, 548 

Inverse of a group element, 545 
interpolation, 471 
Inversion, 569 

Irreducible representations and eigen- 
values, 582 
Isomorphism, 549 
Isoperimetric problems, 209 
Isothermal process, 12 
Isotope effect, 413 
Iterated kernels, 522 
Iteration method for algebraic equa- 
tions, 493 

differential equations, 484 
solution of secular determinant, 503 

Jacobi polynomial, 74, 370 
Jacobians, 17, 18 
elliptic functions, 180 
Jaeger, 266, 267
Jahnke, 98, 120, 254 
James, 424, 503 
Jeans, 190

Jeffreys, B. S., 281, 498
Jeffreys, H., 267, 281, 498, 519
Jensen, 352 
Jordan, 328, 429 
Joule-Thomson coefficient, 23 

Kamke, 88, 120 
Kellogg, 180 
Kemble, 279, 429 
Kepler’s law, 206 
Kernel, eigenvalues of, 527 
infinite, 524 
iterated, 522 

of an integral equation, 520 
symmetric, 528 
Kimball, 430, 576, 585 
Kinetic energy, 283 
Klein and Cayley parameters, 566 
Klotz, 31
Kneser, 215, 544 
Kohn, 503
Kolmogorov, 519 
Kowalewski, 332, 543, 544, 586 
Kron, 171 

Kronecker delta, 105, 124, 163, 308, 341 
Kurosh, 586 

Kutta-Runge method for differential
equations, 486 

Lagrange’s equations of motion, 283 
generalized coordinates, 283 
interpolation formula, 470 
method of undetermined multipliers, 
210 

Lagrangian equations, 206 
function, 205, 283 
multipliers, 209 

Laguerre differential equation, 77 
function, 126 
associated, 78, 129, 366 
polynomial, 77, 126, 132 
associated, 78, 128, 132 
Lanczos, 215 
Landé, 27, 429
Language, classical, 335 
Laplace, 525
pairs, 266 

Laplace’s equation, 161, 208, 237 
applications, 217, 224, 226 
in 2 dimensions, 218 
in 3 dimensions, 220 
Laplace Transformation, 259, 260 





Laplacian, 153 

in cylindrical coordinates, 191 
in spherical polar coordinates, 191 
in tensor notation, 197 
Lass, 171

Latent heat of change of pressure, 11 
Laurent series, 92 
theorem of residues, 92 
Laurent theorem 
residues, 91, 92 
Law, Gauss distribution, 504 
Newton’s, 191, 282 
Least action, principle of, 215 
Least squares, principle of, 506 
Ledermann, 586
Legendre coefficient, 66 
differential equation, 61, 98, 540 
functions, associated, 234 
polynomials, 66, 98, 104, 105, 132, 
227, 270 

polynomial, roots of, 480 
Lehrman, 17 
Leibnitz, 202
Lense, 216
Lerman, 17
Levy, 483 

Lindsay, 207, 273, 431, 439, 443, 466
Line, coordinate, 172, 192 
element, 173, 192 
of force, 45 
integral, 1, 8, 155 
Linear dependence, 132 
equation, 313 

equations, numerical solution of 
simultaneous, 497 
independence of vectors, 311 
integral equation; 520 
momentum in quantum mechanics, 
338 

substitution operators, 260, 406 
transformation, 306, 314, 315 
variation functions, method of, 383 
vector space, 311 
velocity, 145, 285 
Liouville-Neumann series, 521 
Sturm equation, 534 
theorem, 444 
Lithium 

doubly ionized, 365 
Littlewood, 552, 554, 586 
London, 424


Longitude, 177 
Lovitt, 522, 537, 544 

Macdougall, 1
MacDuffee, 332
Maclaurin expansion, 99, 446 
Maclaurin, Formula of Euler and, 474 
Macmillan, 180 
MacRobert, 136
Magnetic field, 408 
Magnetic moment of electron, 403 
Magnus, 136, 177, 252, 267 
Many-body problem, 411 
Margenau, 119, 207, 273, 352
Marschall, 430 
Mason, 180 
Massey, 430
Mathews, 120, 136 
Mathieu, 296, 299 
Mathieu’s differential equation, 78 
Matrices, addition of, 306 
conformable, 306 
direct product of, 307 
equivalent, 316 
multiplication of, 306 
subtraction of, 306 
Matrix, associate, 310 
adjoint, 309 
characteristic, 318 
characteristic roots of, 319 
definition of, 305 
diagonal, 308, 371, 391 
diagonalization of, 319, 331 
eigenvalues of, 319 
eigenvectors of, 319 
Hermitian, 310, 329, 371 
mechanics, 371 

method of solution for secular deter- 
minants, 502 
non-singular, 562 
null, 307 
orthogonal, 310 
partition of, 307 
rank of, 306 
reciprocal, 309 
rectangular, 306 
singular, 305 
symmetric, 310 

symmetric and skew symmetric, 310 
trace of, 308 
transform of, 317 





Matrix, transposed, 309 
unit, 308 
unitary, 310, 330 

Maxima in a tabulated function, 472 
Maximum, area, 210 
volume, 211 
Maxwell, 15, 185, 190 
Maxwell-Boltzmann, distribution law, 
449, 456 

Maxwell’s relations, 15 
Mayer, J. E., 431, 466
Mayer, M. G., 342, 431, 466
McConnell, 171 
McLachlan, 136 
Mean in phase space, 445 
of a function, 436 
of aggregate of measurements, 336 
Measure of precision, 441 
Measurements, rejection of, 516 
weight of, 514 
Mechanical work, 142 
Mechanics, 180, 282 
statistical, 431 
Meister, 585 
Mellin, 526 

Mellin transformation, 259 
Mellor, 494 

Membrane, vibrating circular, 254 
Method, Gauss', 479 
Method of averages for empirical 
formulas, 516 
least squares, 517 

iteration for algebraic equations, 493 
Newton-Raphson for algebraic equa- 
tions, 492 

“regula falsi ” for algebraic equa- 
tions, 491 

Microcanonical ensemble, 446 
Microscopic state, 453 
Milne, E. A., 171 
Milne, W. E., 498, 499, 519
Milne-Thomson, 180, 190 
Milne’s method for differential equa- 
tions, 489 

Minima in a tabulated function, 472 
Minimum value of integral, 198 
surface of revolution, 203 
Minor of determinant, 303 
Mixed-continuous group, 562 
Mixed tensor, 163 
Möbius strip, 156


Molecular spectroscopy, 585 
Molecule, diatomic, 582 
motion of, 290 
polyatomic, 282 
potential energy of a, 295 
quantum mechanical Hamiltonian 
of a, 299 

rotational motion of a, 292 
space, 443 

translational motion of a, 291 
vibrational energy of a, 294 
vibrational frequency of a, 296 
vibrational motion of a, 291 
Moment of a force, 145 
of aggregates of measurements, 346 
of a probability distribution, 437 
of inertia, 286 
of momentum, 285 
Moment theorem, 437 
Momentum, angular, 285 
generalized, 283 
moment of, 285 

Morse, 171, 176, 215, 256, 281, 342, 
429, 521, 526, 543 

Motion, Hamilton’s equations of, 284 
Lagrange’s equations of, 283 
Newton’s laws of, 191, 282 
of a molecule, 290 
Mott, 429, 430
Muir, 302, 332 
Multiplication, 262 
Multipole expansion, 100
Multipole moment, 101
Multipole strength, 101
Multiplication of determinants, 304 
matrices, 306 

Multiplication of determinants, ten- 
sors, 165 

Multiplication table of a group, 546 

Murnaghan, 266, 543, 555, 564, 586 

Murphy, 574 

Murray, 585 

Muskhelishvili, 544 

Mu-space, 443

NBS Mathematical Tables Project, 136
Nebengruppe, 548 
Negative kinetic energy, 358 
Neumann function, 76 
Liouville series, 521 
Neutrons, 418 



Newton’s binomial expansion, 433 
equations of motion, 191, 282 
interpolation formula, 469 
probability distribution, 438 
Newton-Cotes formula, 476 
Newton-Raphson, method for algebraic 
equations, 492 
roots of a polynomial, 494 
Nielsen, H. H., 300
Nielsen, N., 120, 136
Non-orthogonal coordinate systems, 
192 

Non-singular matrices, 562 
Normal coordinates, 292, 326 
divisor, 548 
mode of vibration, 296 
Normalization of functions, 249 
Nucleus, atomic, 282 
of an integral equation, 520 
Numbers, Bernoulli, 474 
Numerical determination of roots of 
polynomial, 494 
differentiation, 472 
evaluation of determinants, 499 
integration, 473 
secular determinants, 500 
simultaneous equations, 493-497 
solution of differential equations, 
482 

transcendental equations, 491 

Oberhettinger, 136, 177, 252, 267 
Oblate spheroidal coordinates, 182 
Observability, essential, 334 
Observable, 335, 338 
Occupation numbers, 453 
Odd function, 103 
Operand, 337 

Operations composing crystallographic 
groups, 574 
Operator, 48, 336, 338 
Operator, commuting, 348 
equation, 337 
Hermitian, 269, 374 
in tensor notation, 195 
vector, 174 
Orbital, 422 

Order of a differential equation, 32 
group, 546 
group element, 546 
Ordinary differential equations, 32 


Orthogonal coordinate system, 173 
matrix, 310 

transformation, 317, 324 
Orthogonality of functions, 248 

quantum mechanical eigenfunctions, 344

Orthogonalization of vectors, 312 
Orthohelium, 423 
Orthonormal functions, 249 
Oscillation, forced, 54 
natural, 52 

Oscillator, anharmonic, 59
by matrix mechanics, 371 
harmonic, 125, 207, 300 
quantum mechanical treatment, 358 
Outer product of vectors, 143 
tensors, 165 

Page, 56, 140, 210
Parabolic coordinates, 185, 186 
Paraboloidal coordinates, 184 
Parameters, Cayley-Klein, 287 
Rodrigues, 287 
Parhelium, 423 
Parington, 31, 171 
Partial differentiation, 1 
differential equation, 216 
Particle, concept of, 334 
free, 282 
vs. wave, 335 

Particles, system of n free, 282 
restricted, 282 
Particular integral, 53 
solution, 33 

Partition function, 452, 465 
of a permutation, 559 
Paul, 31 
Pauli, 417 
principle, 411 
spin theory, 402 

Pauling, 177, 367, 382, 387, 429, 430 
Peirce, 59

Periodicity as boundary condition, 273 
Perlis, 332 

Permutations, 431, 549 
even, 417 
odd, 417 

Perturbation theory, 387 
Pfaff differential equation, 28, 82 
Phase, 25 
integral, 452 





Phase, rule, 25 
space, 442 
velocity, 228, 398 
Phillips, 141, 586
Photon, 418 
Physical system, 335 
Picard method for differential equa- 
tions, 484 

Planck’s constant, 338 
Plane, potential due to charged, 225 
Plummer, 507, 513 
Point function, 8 
scalar, 149, 174 
Poisson’s equation, 237, 242 
formula, 441 

Polar coordinates, spherical, 177, 191 
vector, 165 

Polarizability, atomic, 392 
Polarization, electric, 56, 161 
Polyatomic molecule, 282 
Polygon, rotation of, 572 
Polynomial, complex roots of, 495 
homogeneous, 317 
method, differentiation by, 473 
for solution of secular deter- 
minants, 500 

numerical determination of roots, 
494 

roots of the Legendre, 480 
Postulates of quantum mechanics, 337 
Potential, chemical, 25 
electrostatic, 217, 224 
energy, 176, 283, 295 
theory, 180, 191 
thermodynamic, 14 
velocity, 217, 224 

Potential due to conducting sphere, 224 
charged cylinder, 225 
charged plane, 225 
Precision index, 441, 505 
measures of, 441, 510, 513 
Prigogine, 31 

Principal axis transformation, 326 
Principle of least squares, 506 
Probability, 435 
Probability, aggregate, 435 
amplitude, 347, 402 
density, 436 
of phase, 444 
theory, 435 

Probability distributions, 436 


Probability distributions, arithmetical, 
436 

continuous, 436 
discrete, 436 
geometrical, 436 
Probable errors, 510 
of a function, 515 
Product, Hermitian scalar, 329 
of inertia, 286 
Product of tensors 
inner, 166 
outer, 165
Product of vectors 
cross, 143 
dot, 141 
inner, 141 
outer, 143 
scalar, 141, 311 
scalar, triple, 146 
skew, 143 
three vectors, 146 
triple vector, 147 
vector, 142 
Projectile, 40 

Prolate spheroidal coordinates, 180 
Proper rotation, 325 
Property in probability theory, 435 
Protons, 418 

Quadratic form, 317 
discriminants of, 323 
Quadrature, approximate, 474 
formulas, general remarks, 481 
Quadrupole moment, 101 
Quadrupole, potential due to, 228 
Quantum dynamics, 337 

mechanics, general discussion, 
333 

number, total, 366 
statics, 337 
Quotient group, 549 

Radiation theory, 400 
Radioactive decay, 33, 43 
emission, 439 
Radius vector, 153 
Rainville, 88
Raman effect, 585 
Random walk, 439, 442 
Randomness, criterion of, 435 
Rank of tensor, 162 




Raphson-Newton method for algebraic 
equations, 492 
roots of a polynomial, 494 
Rate of solution, 35 
Rayleigh-Schrodinger perturbation 
formula, 389 

Reaction, bimolecular, 36 
consecutive, 40 
homogeneous, 36 
opposing, 40 
order of, 37 
rate, 37 

termolecular, 36, 40 
unimolecular, 36 
Real eigenvalues, 345 
Reciprocal matrix, 309 
parallelepiped, 352 
vectors, 193 
Recurrence formula, 72 
Reduced mass, 413 
Reducible representation, 550 
Reduction of group representations, 
552 

Reed, 24, 483 

Reflection coefficient of potential bar- 
rier, 356 
rotary, 325 

Regular points of a differential equa- 
tion, 71 

Relative coordinates, 411 
frequencies of measured values, 346 
frequency, 435 
velocity, 289 
Relativity, theory of, 171 
Representation, associated, 560 
of groups, 550 
irreducible, 550 
orthogonality of, 551 
reducible, 550 
self-associated, 560 
Representative point, 442 
Residuals and precision measures, 513 
Residue, 89 

Residues, theorem of, 89 
Resistance, 42 

Resolvent of an integral equation, 522 
Resonance catastrophe, 55 
Reversion of series, 471 
Rigid body, definition of, 284 
most general motion of, 285 
rotation of, 285 


Rigid body, translation of, 285 
Ritz method, 377, 379 
Robertson, 176
Robinson, 495, 497 
Rodrigues’ formula, 102 
parameters, 287 
Rojanski, 429 

Root mean square error, 510 
Roots of Legendre polynomial, 480 
matrix, characteristic, 319 
polynomial, numerical determination 
of, 494, 495
secular determinant, 500 
Rope, suspended, 543 
Rosenthal, 574 
Rossini, 31 

Rotary reflection group, 570 
Rotation, 171 
axis of, 285 
of a rigid body, 285 
vector, 152 
group 

three-dimensional, 565 
two-dimensional, 570 
improper, 324, 325, 569 
proper, 324, 325 
Rotations as groups, 566 
Rotator 

quantum mechanical treatment, 360 
rigid, 300 
Ruark, 429
Rule, Simpson’s, 477 
trapezoidal, 477 
Weddle’s, 478 

Runge-Kutta method for differential 
equations, 486 
Rushbrooke, 466
Rutherford, 171 
Rutledge, 473 

Saddle point, 460 
Sayvetz, 291 
Scalar, 137, 163 
field, 149 
gradient of, 150 
point function, 149, 174 
Scalar product, 141, 311 
Hermitian, 329 
triple, 146 

Scarborough, 471, 495, 519 
Schiff, 429 







Schlaefli’s formula, 102, 103, 105 
Schmidt, 71

Schmidt, orthogonalization method for 
functions, 273 
vectors, 312 

Schmidt-Hilbert method for integral 
equations, 528 

Schoenflies, 574, 586

system of group notation, 576 
Schreier, 332 

Schrodinger, 186, 429, 466
Schrodinger equation, 176, 213, 341 
and group theory, 582 
involving time, 393, 394 
of free mass point, 350 
Schwank, 521

Schwarz’ inequality, 134, 348 
Second order differential equations, 48

numerical solution of, 490 
Second order perturbation, 389 
Secular determinant, 421, 500 
Seitz, 81, 430, 578 

Self-adjoint differential equation, 268 
operator, 267 

Self-associated representation, 560 
Separation of center-of-mass coordi- 
nates, 411 

of variables, method of, 176, 218, 
220, 231 

Series, integration, 59 
Liouville-Neumann, 521 
method for differential equations, 
483 

reversion of, 471 
Shaw, 18, 519 
Shearing strain, 170 
Sherwood, 24, 483
Shortley, 422, 429
Significant figures, 467 
Similarity transformation, 317, 318 
Simpson’s rule, 477 

Simultaneous differential equations, 
numerical solution of, 489 
eigenstates, 348 

equations, numerical solution of, 493, 
497 

Singlet states, 423 
Single-valuedness, 336 
Singular point of a differential equa- 
tion, 70 

solution, 33, 47 


Singularity, essential, 71 
Singularity, non-essential, 71 
Skew product, 143 
symmetric matrix, 310 
symmetric tensor, 164 
Slater, 17, 429
Smith, 300 
Snapshot, 334 
Sneddon, 267, 429 
Soap film, 39 
Sokolnikoff, E. S., 59
Sokolnikoff, I. S., 59
Solution, rate of, 35 
singular, 47 

Solution of differential equations, 
numerical, 482 

of integral equations, 521, 532 
of simultaneous equations, numeri- 
cal, 493, 497 

of transcendental equations, numer- 
ical, 491 

Sommerfeld, 119, 245, 353, 398, 429 
Space, Hermitian vector, 328 
linear vector, 310 
Spain, 171 
Special functions 

Cauchy’s theorem, 90 
Cauchy-Riemann, 90 
Spectroscopy, molecular, 585 
Speiser, 552, 554, 555 
Sphere, moving through incompres- 
sible fluid, 224 
oscillating, 235 
Spherical harmonic, 235, 362 
polar coordinates, 177, 191 
Sperner, 332

Spheroidal coordinates, oblate, 182 
prolate, 180 

Spin angular momentum, 403 
coordinate, 403 
degeneracy, 409 
displacement operators, 407 
energy, 408 
function, 405 
in two-body problem, 422 
matrices, 405, 566 
operator, 405 
vector, 405 

Spinning electron, 402 
Spread of measurements, 349 
Standard deviation, 349, 437 





Stark effect, 186, 391 
State 

quantum mechanical, 316 
time-dependent, 393 
function, 336 

intuitive meaning of, 343, 347 
Stationarity condition, 200 
Stationary path, 198 
states, 337 

Statistical mechanics, 431 
Steepest descents, method of, 459 
Stehle, 282 
Steiner, 1, 31
Step function, 239 
Stirling’s formula, 436 
interpolation formula, 470 
theorem, 97 
Stokes’ theorem, 157 
Strain, 161, 169, 170
Stratton, 119, 256 
Strength, field, 161 
Stress, 161 

String, homogeneous vibrating, 542 
Sturm-Liouville equation, 534, 538 
theory, 267, 280, 342 
Subgroup, 546 
conjugate, 548 
invariant, 548 
Sublimation, 38 
Subtraction of matrices, 306 
Sum of state, 452 
of tensors, 164 
of vectors, 141 
Surface, coordinate, 172 
element, 173, 192 
integral, 156 
tension, 39 

Suspension bridge, 57 
Symbol, Christoffel three-index, 167,
196 

Symmetric eigenfunctions, 416 
group, 558 
kernel, 528 
matrix, 310 
state function, 455 
tensor, 164 
top, 368 

System, conservative, 283 
thermodynamic, 442 

Tallquist, 136


Tamarkin, 245
Taylor, 465, 478, 492
Taylor series, 102 

Taylor series method for differential 
equations, 483 
Temperature, 29 
Tension strain, 170 
Tensor, associated, 167 
component of, 162 
contraction of, 166 
contravariant, 163
covariant, 163
covariant derivative of, 169 
differentiation of, 167 
first rank, 162 
length of, 166 
mixed, 163 
product, inner, 166 
product, 165 
skew-symmetric, 164 
symmetric, 164 

Tensor notation, differential operators 
in, 195 

divergence in, 196 
gradient in, 196 
Laplacian in, 197 
curl in, 197 

Tensors in curvilinear coordinates, 192 
difference of, 164 
perpendicular, 166 
sum of, 164 
Ter Haar, 466
Theorem, Gauss’, 159 
Green’s, 161 
of divergence, 159 
of residues, 457 
Stokes’, 157 

Thermal conductivity, 35 
flux, 151 

Thermodynamic derivatives, 15 
potential, Gibbs, 14 
relations, 450 
system, 442 
variables, 1 
laws of, 11 
second law of, 13 
Thomas, L. H., 491
Thomas, T. Y., 171
Thomson, 207

Three-index symbol, Christoffel, 167 
Time-dependent states, 373 





Titchmarsh, 252, 267
Tobolsky, 17
Toeplitz, 543
Tolman, 431, 443, 466
Top, spherical, 371
symmetrical, 368 
Toroidal coordinates, 190 
Torrance, 59, 156
Total differentials, 3, 8 
Trace, 554 
of matrix, 308 
Trambarulo, 300

Transcendental equations, numerical 
solution of, 491 

Transformation, collineatory, 317 
congruent, 317, 322 
conjunctive, 317 
linear, 314 
orthogonal, 324 
principal axis, 326 
real orthogonal, 317 
similarity, 317, 318 
unitary, 317, 564 

Transform of a group element, 547 
of a matrix, 317 

in solving differential equation, 263 
Transients, 43 
Transition probability, 402 
forbidden, 402 
Translation, 171 
Translation of a molecule, 291 
of a rigid body, 285 

Transmission coefficient of barrier, 357 
Transparency factor of a potential 
barrier, 358 

Transposed matrix, 308
Transposition, 558 
Trapezoidal rule, 477 
Triple product, scalar, 146 
vector, 147 
Triplet states, 423 
Tschebyscheff polynomial, 74, 132 
differential equation, 73 
Tunell, 12 
Tunnel effect, 356 
Turnbull, 316, 322

Two-body problem in quantum mechanics, 413

Two-sided transformation, 259 
Uhlenbeck, electron spin, 403


Umbral index, 162 

Uncertainty in angular momentum, 350 
Uncertainty principle, 348, 394 
Unimodular unitary group, 563 
Unitary group, 562 
matrix, 310, 330 
transformation, 317, 564 
Unit element in group theory, 545 
matrix, 308 
vectors, 140, 174 
Urey, 429
Uspensky, 519 

Valence bond coordinates, 300 
Value of a physical quantity, most 
probable, 504 
true, 504 

Van der Waals' equation, 5, 24
Van der Waerden, 585, 586
Van Vleck, 392, 429
Variable, canonically conjugate, 284 
cogredient, 318 
contragredient, 317 
extensive, 1 
independent, 32 
intensive, 1 
thermodynamic, 1 

Variables, method of separation of, 176 
Variation, 199 

Variation theory of eigenvalue problems, 270

Variational method, 377 
Variations, calculus of, 198 
Vector area, 144 
axial, 165 
column, 305 
components of, 137 
contravariant, 162
covariant, 163 
curl of, 152 

curvilinear component of, 174 

differentiation of, 148 

divergence of, 151 

field, 149 

integration, 154 

irrotational, 154 

length of, 137 

magnitude of, 137 

operator, 174 

origin of, 137 

polar, 164 





Vector area, product, 142 
pseudovector, 165 
radius, 153 
row, 305 
solenoidal, 154 
space, Hermitian, 328 
linear, 310 
sum, 141 
terminus of, 137 
triple product, 147 
unit, 140, 174 
Vectors, base, 193 
difference of, 141 
linear independence of, 311, 312 
orthogonalization of, 312 
products of three, 146 
reciprocal, 193 
scalar product of, 141, 311 
Velocity, absolute, 289 
angular, 145, 285 
linear, 145, 285 
of following, 290 
potential, 217, 224 
relative, 289 
Vena contracta, 34 

Vibrating sphere, with node at surface, 
258 

string, 208, 247 
Vibration problems, 542 
Vibrational energy of a molecule, 294

frequency of a molecule, 296 
Vibrations, forces, 542 
normal mode of, 296 
of a molecule, 290 
Vivanti-Schwank, 544
v. Karman, 270
v. Mises, 216, 244, 267
v. Neumann, 429
Volume element, 173 
integrals, 156 

Volterra’s integral equation, 520

Wade, 171, 332 
Walter, 430, 576, 585 
Watson, 80, 96, 136, 180, 543 
Wattless current, 56 
Wave equation, 212, 228 
Schrodinger, 176 
space form of, 230, 231 


Wave length, 230, 247 
Wave length, number, 230 
packets, 396 
Waves 

in one dimension, 231 
in two dimensions, 231 
in three dimensions, 232 
Waves, monochromatic, 235 
plane, 229 
spherical, 230 
standing, 230 
vs. particles, 335 
Weatherburn, 171 
Weaver, 180 
Webster, 245 
Weddle’s rule, 478 

Weierstrass, definition of gamma function, 95
p-function, 180 

Weight of measurement, 514

Wentzel, 215 
Weyl, 330, 582, 586 
Wheatstone bridge, 314 
Whittaker, 80, 96, 180, 282, 287, 326, 
495, 497, 519, 543, 566 
Widder, 267 
Wiener, 267
Wigner, 287, 328, 560, 562, 564 ff., 582,
586
Willers, 519
Wilson, E. B., 141, 163, 193
Wilson, E. B., Jr., 177, 299, 320, 367,
382, 387, 429, 519
Wintner, 305
Wolfsohn, 119
Work content, 13 
in thermodynamics, 10 
mechanical, 142 
Wronskian, 33, 134 
Wu, 296 

Yoe, 504 
Youden, 519 

Zachariasen, 586

Zassenhaus, 586

Zeeman effect, normal, 392

Zemansky, 1

Zernike, 437

Zonal harmonic, 66 



TABLE A.1 CONVERSIONS, CONSTANTS, AND FORMULAS 


Volume and Weight

1 U.S. gallon = 8.34 lbs × Sp Gr
1 U.S. gallon = 0.84 Imperial gallon
1 cu ft of liquid = 7.48 gal
1 cu ft of liquid = 62.32 lbs × Sp Gr
Specific gravity of sea water = 1.025 to 1.03
1 cu meter = 264.5 gal
1 barrel (oil) = 42 gal


Capacity and Velocity

1 gpm = 1/449 cu ft per sec
gpm = lbs per hour / (500 × Sp Gr)
gpm = 0.069 × boiler Hp
gpm = 0.7 × bbl/hour = 0.0292 × bbl/day
1 gpm = 0.227 metric tons per hour
1 mgd = 694.5 gpm

$V = \dfrac{\text{gpm} \times 0.321}{\text{area in sq in}} = \dfrac{\text{gpm} \times 0.409}{D^2}$

$V = \sqrt{2gH}$


Head

Head in feet = Head in psi × 2.31 / Sp Gr

1 inch of mercury = 1.133 feet of water (cold, fresh)
1 psi = 0.0703 kilograms per sq centimeter
1 psi = 0.068 atmosphere

$H = \dfrac{V^2}{2g}$


Power and Torque

1 horsepower = 550 ft-lb per sec
             = 33,000 ft-lb per min
             = 2545 btu per hr
             = 745.7 watts
             = 0.7457 kilowatts

$\text{bhp} = \dfrac{\text{gpm} \times \text{Head in feet} \times \text{Sp Gr}}{3960 \times \text{efficiency}}$

$\text{bhp} = \dfrac{\text{gpm} \times \text{Head in psi}}{1714 \times \text{efficiency}}$

$\text{Torque in lbs feet} = \dfrac{\text{Hp} \times 5252}{\text{rpm}}$


Symbols

gpm = gallons per minute
Sp Gr = specific gravity based on water at 62°F
Hp = horsepower
bhp = brake horsepower
rpm = revolutions per minute
psi = pounds per square inch
bbl = barrel (oil) = 42 gal
mgd = million gallons per day of 24 hours
V = velocity in ft/sec
D = diameter in inches
g = 32.16 ft/sec/sec
H = head in feet


Miscellaneous Centrifugal Pump Formulas

$\text{Specific speed} = N_s = \dfrac{\text{rpm} \sqrt{\text{gpm}}}{H^{3/4}}$

where H = head per stage in feet

$\text{Diameter of impeller in inches} = d = \dfrac{1840\, K_u \sqrt{H}}{\text{rpm}}$

where Ku is a constant varying with impeller type and design. Use H at
shut-off (zero capacity); Ku is then approx. 1.0.

At constant speed:

$\dfrac{d_1}{d_2} = \dfrac{\text{gpm}_1}{\text{gpm}_2} = \dfrac{\sqrt{H_1}}{\sqrt{H_2}} = \dfrac{\sqrt[3]{\text{bhp}_1}}{\sqrt[3]{\text{bhp}_2}}$

At constant impeller diameter:

$\dfrac{\text{rpm}_1}{\text{rpm}_2} = \dfrac{\text{gpm}_1}{\text{gpm}_2} = \dfrac{\sqrt{H_1}}{\sqrt{H_2}} = \dfrac{\sqrt[3]{\text{bhp}_1}}{\sqrt[3]{\text{bhp}_2}}$
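The head and brake-horsepower relations in Table A.1 are simple enough to check numerically. The following is a minimal sketch, assuming efficiency is entered as a fraction between 0 and 1 and that the constants 2.31, 3960, 1714, and g = 32.16 ft/sec/sec are taken at face value from the table. The function names are illustrative only and do not come from the book.

```python
import math

G_FT_PER_S2 = 32.16  # g as listed in Table A.1, ft/sec/sec


def head_feet_from_psi(head_psi, sp_gr=1.0):
    """Head in feet = head in psi x 2.31 / Sp Gr."""
    return head_psi * 2.31 / sp_gr


def brake_horsepower(gpm, head_ft, sp_gr=1.0, efficiency=1.0):
    """bhp = gpm x head in feet x Sp Gr / (3960 x efficiency).

    Assumption: efficiency is a fraction in (0, 1], not a percentage.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be a fraction in (0, 1]")
    return gpm * head_ft * sp_gr / (3960.0 * efficiency)


def brake_horsepower_from_psi(gpm, head_psi, efficiency=1.0):
    """bhp = gpm x head in psi / (1714 x efficiency)."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be a fraction in (0, 1]")
    return gpm * head_psi / (1714.0 * efficiency)


def velocity_from_head(head_ft):
    """V = sqrt(2 g H), with V in ft/sec and H in feet."""
    return math.sqrt(2.0 * G_FT_PER_S2 * head_ft)


if __name__ == "__main__":
    # Hypothetical duty point: 500 gpm of cold water against 50 psi at 75% efficiency.
    head_ft = head_feet_from_psi(50.0)
    print("head, ft:", head_ft)
    print("bhp (feet form):", brake_horsepower(500.0, head_ft, 1.0, 0.75))
    print("bhp (psi form): ", brake_horsepower_from_psi(500.0, 50.0, 0.75))
    print("velocity, ft/sec:", velocity_from_head(head_ft))
```

The two brake-horsepower forms should agree to within rounding of the tabulated constants, since 3960 / 2.31 is approximately 1714.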


TABLE A.2 MEASUREMENT CONVERSION 



Atmosphere .................... 14.7 pounds (English)
                                14.223 pounds (Russian)

Btu (British Thermal Unit) .... 778 foot pounds
                                0.2930 watt hour
                                0.252 calorie

Calorie ....................... 1 kilogram of water raised 1 degree Centigrade
                                3.97 Btu

Centare (square meter) ........ 10.764 square feet

Centimeter .................... 0.3937 inch

Cheval (French hp) ............ 0.986 horsepower

Cubic Centimeter (milliliter) . 0.061 cubic inch

Cubic Foot .................... 1,728 cubic inches
                                7.48 gallons
                                60 pints
                                8/10 bushel
                                62.32 lbs water (62°F)
                                1,000 ounces of water, approx.
                                0.028 cubic meter
                                28.32 liters

Cubic Inch .................... 16.39 cubic centimeters

Cubic Meter ................... 35.315 cubic feet
                                1.308 cubic yards

Cubic Yard .................... 27 cubic feet
                                0.765 cubic meter

Decimeter ..................... 3.937 inches

Foot .......................... 12 inches
                                0.305 meter

Foot Pound .................... 0.1383 kilogrammeter

Gallon ........................ 231 cubic inches
                                4 quarts
                                8 pints
                                3.785 liters
                                128 fluid ounces
                                8.33 pounds of water

Gallon per Minute ............. 1/449 cubic foot per second

Gallon (British Imperial) ..... 277.3 cubic inches
                                1.201 U.S. gallons
                                10 lbs water at 15°C
                                4.546 liters

Gram .......................... 15.43 grains
                                0.0353 ounce
                                0.0022 pound

Horsepower .................... 33,000 ft lb per minute
                                1.014 cheval
                                746 watts

Inch .......................... 39,540 wave lengths of red ray of cadmium
                                25.4 millimeters

Kilogram ...................... 2.2046 pounds
                                35.274 ounces
                                15432.36 grains
                                0.0011 short ton
                                0.00098 long ton

Kilogram per Cubic Meter ...... 0.0624 lbs per cu ft

Kilogram per Square Centimeter  14.225 lbs per sq in

Kilogram per Square Meter ..... 0.205 lbs per sq ft

Kilometer ..................... 1,000 meters
                                0.621 mile

Kilowatt ...................... 1.34 horsepower
                                56.87 Btu per minute

Liter ......................... 1.000027 cubic decimeters
                                1.057 quart
                                0.264 gallon
                                61.02 cubic inches
                                0.035 cubic feet
                                33.8147 fluid ounces
                                270.518 fluid drams

Liter per Second .............. 0.474 U.S. gal per min

Meter ......................... 39.37 inches
                                3.28 feet
                                1.09 yards

Metric Ton .................... 1.1023 short tons

Mil ........................... 25.4 microns
                                0.0254 millimeter

Mile .......................... 5,280 feet
                                1.61 kilometers

Ounce ......................... 0.911 troy ounces
                                28.35 grams

Ounce (Fluid) ................. 29.573 milliliters

Ounce (Fine) .................. Troy ounce
                                480 grains
                                31.104 grams

Pint .......................... 0.4732 liter
                                16 fluid ounces

Pound Avoirdupois ............. 16 ounces
                                7,000 grains
                                454 grams
                                0.454 kilogram
                                14.58 troy ounces

Pound per Cubic Foot .......... 16.02 kilograms per cubic meter

Pound per Sq In ............... 2.31 feet head of water at 1.00 sp gr
                                0.0703 kilogram per sq centimeter

Pound per Sq Ft ............... 4.88 kilograms per square meter

Quart ......................... 2 pints
                                1/4 gallon
                                0.946 liter

Square Centimeter ............. 0.155 square inch

Square Foot ................... 0.093 square meter
                                144 square inches

Square Inch ................... 6.452 square centimeters

Square Kilometer .............. 0.386 square mile

Square Meter (centare) ........ 10.764 square feet
                                1.196 square yards

Square Mile ................... 640 acres
                                3,097,600 square yards
                                2.59 square kilometers

Square Millimeter ............. 0.00155 square inch

Square Yard ................... 0.836 square meter

Stere ......................... 1 cubic meter

Stone (British) ............... 14 pounds
                                6.35 kilograms

Ton (short) ................... 2,000 pounds
                                907 kilograms

Ton (long) .................... 2,240 pounds
                                1,016 kilograms
                                270 gallons

Ton per Hour (metric) ......... 4.4 gallons per minute

Tonne (metric) ................ 1,000 kilograms
                                2204.62 pounds

Yard .......................... 3 feet
                                36 inches
                                0.914 meter





